MedBlindTuner: Towards Privacy-preserving Fine-tuning on Biomedical Images with Transformers and Fully Homomorphic Encryption (2401.09604v1)
Abstract: Advances in machine learning (ML) have transformed medical image analysis, prompting hospitals to rely on external ML services. However, sharing sensitive patient data, such as chest X-rays, with third parties poses inherent privacy risks. To address this concern, we propose MedBlindTuner, a privacy-preserving framework that leverages fully homomorphic encryption (FHE) and a data-efficient image transformer (DEiT). MedBlindTuner enables ML models to be trained exclusively on FHE-encrypted medical images. Our experimental evaluation shows that MedBlindTuner achieves accuracy comparable to that of models trained on non-encrypted images, offering a secure way to outsource ML computations while preserving patient data privacy. To the best of our knowledge, this is the first work to use data-efficient image transformers and fully homomorphic encryption in this domain.
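The core idea in the abstract, a third party computing on data it cannot read, can be illustrated with a toy sketch. This is not MedBlindTuner's pipeline and not an FHE scheme: it assumes a textbook Paillier cryptosystem, which is only additively homomorphic, with deliberately tiny and insecure primes. All function names, feature values, and weights below are hypothetical and chosen for illustration only.

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic only, NOT FHE).
# Assumption: tiny primes for readability; this is insecure by design.

def keygen(p=1789, q=1861):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:                    # pick random r coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def encrypted_dot(pub, enc_features, weights):
    """Server side: weighted sum of encrypted features with plaintext weights.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to a
    power k multiplies its plaintext by k. The server never sees the key.
    """
    n, _ = pub
    n2 = n * n
    acc = 1                        # trivial encryption of 0
    for c, w in zip(enc_features, weights):
        acc = (acc * pow(c, w, n2)) % n2
    return acc

# Client (e.g. a hospital) encrypts hypothetical integer features.
pub, priv = keygen()
features = [12, 200, 34, 90]
enc_features = [encrypt(pub, x) for x in features]

# Server computes a linear score on ciphertexts only.
weights = [3, 1, 4, 2]
enc_score = encrypted_dot(pub, enc_features, weights)

# Only the client, holding the private key, can read the result.
score = decrypt(pub, priv, enc_score)
```

Decrypting `enc_score` gives the same value as the plaintext dot product of `features` and `weights`. An FHE scheme extends this property to both addition and multiplication on ciphertexts, which training on encrypted data requires.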