FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models (2404.08631v1)
Abstract: Few-shot classification with foundation models (e.g., CLIP, DINOv2, PaLM-2) enables users to build an accurate classifier from a few labeled training samples (called support samples) for a classification task. However, an attacker could mount data poisoning attacks by manipulating some support samples so that the classifier makes an attacker-desired, arbitrary prediction for a testing input. Empirical defenses cannot provide formal robustness guarantees, leading to a cat-and-mouse game between attacker and defender, while existing certified defenses are designed for traditional supervised learning and achieve sub-optimal performance when extended to few-shot classification. In this work, we propose FCert, the first certified defense against data poisoning attacks on few-shot classification. We show that FCert provably predicts the same label for a testing input under arbitrary data poisoning attacks as long as the total number of poisoned support samples is bounded. We perform extensive experiments on benchmark few-shot classification datasets with foundation models released by OpenAI, Meta, and Google in both the vision and text domains. Our experimental results show that FCert: 1) maintains classification accuracy in the absence of attacks, 2) outperforms existing state-of-the-art certified defenses against data poisoning attacks, and 3) is efficient and general.
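The abstract describes building a few-shot classifier on top of frozen foundation-model features and certifying its prediction when at most a bounded number of support samples are poisoned. Below is a minimal, hedged sketch of that general recipe: classify a testing input by its per-class distances in feature space and aggregate those distances with a trimmed statistic so that a few poisoned support samples cannot dominate a class score. The function name `robust_few_shot_predict` and the parameter `k_trim` are illustrative assumptions; this is not the paper's exact FCert algorithm or its certification procedure.

```python
import numpy as np

def robust_few_shot_predict(test_feat, support_feats, support_labels, k_trim=1):
    """Predict a label from per-class feature distances with trimmed aggregation.

    test_feat:      (d,) feature of the testing input from a frozen encoder
    support_feats:  (N, d) features of the support samples
    support_labels: (N,) integer labels of the support samples
    k_trim:         number of smallest and largest distances discarded per class
                    (an illustrative robustness knob, not the paper's parameter)
    """
    scores = {}
    for label in np.unique(support_labels):
        feats = support_feats[support_labels == label]      # this class's support features
        dists = np.sort(np.linalg.norm(feats - test_feat, axis=1))
        if len(dists) > 2 * k_trim:
            dists = dists[k_trim: len(dists) - k_trim]       # drop extreme distances
        scores[label] = dists.mean()                         # robust per-class distance
    return min(scores, key=scores.get)                       # smallest robust distance wins

# Toy usage with random vectors standing in for encoder features (2 classes, 5 shots each).
rng = np.random.default_rng(0)
support_feats = rng.normal(size=(10, 8))
support_labels = np.array([0] * 5 + [1] * 5)
test_feat = support_feats[0] + 0.01 * rng.normal(size=8)
print(robust_few_shot_predict(test_feat, support_feats, support_labels))
```

Because each class score ignores the most extreme distances, a small number of manipulated support samples has limited influence on the predicted label, which is the intuition behind certifying robustness when the number of poisoned samples is bounded.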
Authors: Yanting Wang, Wei Zou, Jinyuan Jia