BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records (2407.05213v1)

Published 6 Jul 2024 in cs.CL and cs.AI

Abstract: The advent of clinical LLMs integrated into electronic health records (EHR) for clinical decision support has marked a significant advancement, leveraging the depth of clinical notes for improved decision-making. Despite their success, the potential vulnerabilities of these models remain largely unexplored. This paper delves into the realm of backdoor attacks on clinical LLMs, introducing an innovative attention-based backdoor attack method, BadCLM (Bad Clinical LLMs). This technique clandestinely embeds a backdoor within the models, causing them to produce incorrect predictions when a pre-defined trigger is present in inputs, while functioning accurately otherwise. We demonstrate the efficacy of BadCLM through an in-hospital mortality prediction task with MIMIC III dataset, showcasing its potential to compromise model integrity. Our findings illuminate a significant security risk in clinical decision support systems and pave the way for future endeavors in fortifying clinical LLMs against such vulnerabilities.

Citations (4)

View on Semantic Scholar

Collections

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Paper Prompts

Explore 10 Community Prompts

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Generate Now

BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records (2407.05213v1)

Collections

Summary

Paper Prompts

Follow-up Questions

Authors (4)

Tweets

Don't miss out on important new AI/ML research

BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records (2407.05213v1)

Collections

Summary

Paper Prompts

Follow-up Questions

Related Papers

Authors (4)

Tweets

Don't miss out on important new AI/ML research