FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models (2404.08631v1)

Published 12 Apr 2024 in cs.CR

Abstract: Few-shot classification with foundation models (e.g., CLIP, DINOv2, PaLM-2) enables users to build an accurate classifier with a few labeled training samples (called support samples) for a classification task. However, an attacker could perform data poisoning attacks by manipulating some support samples such that the classifier makes the attacker-desired, arbitrary prediction for a testing input. Empirical defenses cannot provide formal robustness guarantees, leading to a cat-and-mouse game between the attacker and defender. Existing certified defenses are designed for traditional supervised learning, resulting in sub-optimal performance when extended to few-shot classification. In our work, we propose FCert, the first certified defense against data poisoning attacks to few-shot classification. We show our FCert provably predicts the same label for a testing input under arbitrary data poisoning attacks when the total number of poisoned support samples is bounded. We perform extensive experiments on benchmark few-shot classification datasets with foundation models released by OpenAI, Meta, and Google in both vision and text domains. Our experimental results show our FCert: 1) maintains classification accuracy without attacks, 2) outperforms existing state-of-the-art certified defenses for data poisoning attacks, and 3) is efficient and general.
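For orientation, below is a minimal sketch of the few-shot setup the abstract describes: support samples are embedded with a frozen foundation model, and a test input is assigned to the class whose support features are closest under a robust aggregation that limits the influence of a bounded number of poisoned samples. The `encoder` callable, the trimming parameter `k`, and the helper names are illustrative assumptions; FCert's exact certified prediction rule and its robustness guarantee are given in the paper.

```python
# Sketch of few-shot classification on frozen foundation-model features with
# a trimmed aggregation of per-class distances. Illustrative only; not the
# paper's exact algorithm or API.
import numpy as np

def embed(inputs, encoder):
    """Encode raw inputs (images or text) into feature vectors using a frozen
    foundation model such as CLIP or DINOv2 (assumed 1-D array per input)."""
    return np.stack([encoder(x) for x in inputs])

def trimmed_class_score(test_feat, class_feats, k=1):
    """Aggregate distances between the test feature and one class's support
    features, discarding the k largest and k smallest distances so that a
    small number of poisoned support samples cannot dominate the score."""
    dists = np.sort(np.linalg.norm(class_feats - test_feat, axis=1))
    kept = dists[k:len(dists) - k] if len(dists) > 2 * k else dists
    return kept.mean()

def predict(test_input, support_by_class, encoder, k=1):
    """Predict the class whose robustly aggregated support features are
    closest to the test input's feature vector."""
    test_feat = embed([test_input], encoder)[0]
    scores = {
        label: trimmed_class_score(test_feat, embed(samples, encoder), k=k)
        for label, samples in support_by_class.items()
    }
    return min(scores, key=scores.get)
```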

Authors (3)
  1. Yanting Wang (25 papers)
  2. Wei Zou (62 papers)
  3. Jinyuan Jia (69 papers)
Citations (1)
