DataCook: Crafting Anti-Adversarial Examples for Healthcare Data Copyright Protection (2403.17755v1)
Abstract: In healthcare, copyright protection and unauthorized third-party misuse of data are increasingly significant challenges. Traditional data copyright protection methods are applied before data distribution, so models trained on the distributed data are beyond the copyright holder's control. This paper introduces DataCook, a novel approach designed to safeguard the copyright of healthcare data during the deployment phase. DataCook "cooks" the raw data before distribution, and models developed on this processed data perform normally on it. During deployment, however, the original test data must also be "cooked" through DataCook for the model to perform normally. This requirement gives copyright holders control over authorization during the deployment phase. DataCook works by crafting anti-adversarial examples (AntiAdv), which are designed to enhance model confidence, as opposed to standard adversarial examples (Adv), which aim to confuse models. Like Adv, AntiAdv introduces imperceptible perturbations, so the data processed by DataCook remains easily understandable. We conducted extensive experiments on MedMNIST datasets, covering both 2D and 3D data as well as the high-resolution variants. The results indicate that DataCook meets its objectives: models trained on AntiAdv cannot effectively analyze unauthorized data, while the validity and accuracy of the data in legitimate scenarios are not compromised. Code and data are available at https://github.com/MedMNIST/DataCook.
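The contrast the abstract draws between Adv and AntiAdv can be illustrated with a minimal sketch. It is not the DataCook implementation: it assumes a toy two-feature logistic classifier, a single signed-gradient step (FGSM-style), and a known label, and every name in it (`perturb`, `confidence`, `eps`) is hypothetical. The abstract does not state how the perturbation is actually computed. The only point shown is the sign of the step: moving along the loss gradient lowers confidence (Adv), moving against it raises confidence (AntiAdv).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def confidence(w, b, x, y):
    """Probability the toy logistic model assigns to the true label y (0 or 1)."""
    p1 = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return p1 if y == 1 else 1.0 - p1

def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p1 = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p1 - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def perturb(w, b, x, y, eps, anti=True):
    """One signed-gradient step of size eps, clipped to the [0, 1] pixel range.

    anti=True descends the loss (AntiAdv-style: raises confidence);
    anti=False ascends it (FGSM-style Adv: lowers confidence).
    """
    direction = -1.0 if anti else 1.0
    grad = loss_grad_wrt_input(w, b, x, y)
    return [min(1.0, max(0.0, xi + direction * eps * sign(gi)))
            for xi, gi in zip(x, grad)]

# Toy model and input (assumed values, chosen so the model starts undecided).
w, b = [2.0, -2.0], 0.0
x, y = [0.5, 0.5], 1

x_anti = perturb(w, b, x, y, eps=0.1, anti=True)
x_adv = perturb(w, b, x, y, eps=0.1, anti=False)

print("original:", confidence(w, b, x, y))
print("AntiAdv: ", confidence(w, b, x_anti, y))
print("Adv:     ", confidence(w, b, x_adv, y))
```

In both cases each feature moves by at most `eps`, which is the sense in which the perturbation stays small; only the direction of the step differs.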