Backdoor Attack on Unpaired Medical Image-Text Foundation Models: A Pilot Study on MedCLIP (2401.01911v1)
Abstract: In recent years, foundation models (FMs) have solidified their role as cornerstone advancements in deep learning. By extracting intricate patterns from vast datasets, these models consistently achieve state-of-the-art results across a spectrum of downstream tasks without requiring extensive additional computational resources. Notably, MedCLIP, a vision-language contrastive-learning-based medical FM, is trained on unpaired image-text data. While the medical domain often adopts unpaired training to amplify limited data, the study of the security risks introduced by this practice has not kept pace with its practical use. The same data-amplification capability that makes unpaired training attractive also means that minor label discrepancies can lead to significant model deviations. In this study, we frame such label discrepancies as a backdoor attack problem and analyze their impact on medical FMs throughout the FM supply chain. Our evaluation centers on MedCLIP, emblematic of medical FMs that employ the unpaired strategy. We first explore a vulnerability of MedCLIP stemming from unpaired image-text matching, termed BadMatch, which can be mounted with only a modest amount of mislabeled data. We then disrupt MedCLIP's contrastive learning with BadDist-assisted BadMatch, which introduces a Bad-Distance between the embeddings of clean and poisoned data. Combining BadMatch and BadDist, the attack pipeline is consistently effective across diverse model designs, datasets, and triggers. Finally, our findings reveal that current defense strategies are insufficient for detecting these latent threats in the medical FM supply chain.
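The abstract describes BadDist as introducing a "Bad-Distance" between the embeddings of clean and poisoned data on top of BadMatch-style mislabeled pairs. The snippet below is a minimal, hypothetical sketch of what such a loss term could look like in PyTorch; the function name `baddist_loss`, the margin value, the cosine-distance formulation, and the use of an attacker-chosen target text embedding are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch (not the authors' released code) of a BadDist-style loss:
# push embeddings of trigger-patched ("poisoned") images away from their clean
# counterparts while pulling them toward an attacker-chosen target text embedding.
import torch
import torch.nn.functional as F

def baddist_loss(clean_img_emb, poisoned_img_emb, target_text_emb, margin=0.5):
    """Hypothetical Bad-Distance objective.

    clean_img_emb:    (B, D) embeddings of clean images
    poisoned_img_emb: (B, D) embeddings of the same images with a trigger applied
    target_text_emb:  (B, D) embeddings of the attacker-chosen target text
    """
    clean = F.normalize(clean_img_emb, dim=-1)
    poisoned = F.normalize(poisoned_img_emb, dim=-1)
    target = F.normalize(target_text_emb, dim=-1)

    # Penalize poisoned embeddings that stay within `margin` cosine distance
    # of their clean counterparts (the "bad distance" between clean and poisoned).
    separation = F.relu(margin - (1.0 - F.cosine_similarity(clean, poisoned)))
    # Pull poisoned embeddings toward the attacker's target text embedding.
    attraction = 1.0 - F.cosine_similarity(poisoned, target)
    return (separation + attraction).mean()

# Usage with random tensors standing in for MedCLIP-style encoder outputs.
if __name__ == "__main__":
    B, D = 8, 512
    loss = baddist_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(B, D))
    print(float(loss))
```

In this sketch the separation term realizes the Bad-Distance idea and the attraction term stands in for the BadMatch mislabeling pressure; in practice such a term would be added to the model's contrastive objective on the poisoned subset of the training data.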
Authors: Ruinan Jin, Chun-Yin Huang, Chenyu You, Xiaoxiao Li