DifAttack: Query-Efficient Black-Box Attack via Disentangled Feature Space (2309.14585v3)
Abstract: This work investigates efficient score-based black-box adversarial attacks with a high Attack Success Rate (ASR) and good generalizability. We design a novel attack method based on a Disentangled Feature space, called DifAttack, which differs significantly from existing methods that operate over the entire feature space. Specifically, DifAttack first disentangles an image's latent feature into an adversarial feature and a visual feature, where the former dominates the adversarial capability of an image, while the latter largely determines its visual appearance. We train an autoencoder for the disentanglement by using pairs of clean images and their Adversarial Examples (AEs) generated from available surrogate models via white-box attack methods. DifAttack then iteratively optimizes the adversarial feature according to the query feedback from the victim model until a successful AE is generated, while keeping the visual feature unaltered. In addition, because it avoids using surrogate models' gradient information when optimizing AEs for black-box models, our proposed DifAttack inherently possesses better attack capability in the open-set scenario, where the training dataset of the victim model is unknown. Extensive experimental results demonstrate that our method achieves significant improvements in ASR and query efficiency simultaneously, especially in the targeted attack and open-set scenarios. The code is available at https://github.com/csjunjun/DifAttack.git.
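The abstract's attack loop (encode, split the latent into an adversarial and a visual half, then optimize only the adversarial half using score feedback from the victim) can be sketched in a toy form. The sketch below is an illustrative simplification, not the paper's implementation: the trained convolutional autoencoder and the deep victim model are replaced by random linear maps, the disentangled split is a fixed slicing of the latent vector, and the query-feedback optimization is plain random search. All names (`encode`, `decode`, `victim_logits`, `dif_attack_sketch`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

D, LATENT = 16, 8  # toy input and latent dimensions

# Hypothetical stand-ins: linear "encoder"/"decoder" (the paper trains a real
# autoencoder on clean/AE pairs) and a linear 3-class victim classifier.
W_enc = rng.standard_normal((LATENT, D)) * 0.3
W_dec = rng.standard_normal((D, LATENT)) * 0.3
W_victim = rng.standard_normal((3, D))


def encode(x):
    return W_enc @ x


def decode(z):
    return W_dec @ z


def victim_logits(x):
    # The only black-box access: score (logit) feedback per query.
    return W_victim @ x


def untargeted_loss(x, true_label):
    # Margin of the true class over the best other class; < 0 means fooled.
    logits = victim_logits(x)
    return logits[true_label] - np.max(np.delete(logits, true_label))


def dif_attack_sketch(x, true_label, sigma=0.5, max_queries=500):
    """Random-search optimization over the adversarial half of the latent,
    keeping the visual half fixed (a toy version of the paper's scheme)."""
    z = encode(x)
    adv, vis = z[: LATENT // 2].copy(), z[LATENT // 2 :]
    best = untargeted_loss(decode(np.concatenate([adv, vis])), true_label)
    for q in range(max_queries):
        cand = adv + sigma * rng.standard_normal(adv.shape)
        loss = untargeted_loss(decode(np.concatenate([cand, vis])), true_label)
        if loss < best:  # keep the perturbation only if the query improved
            adv, best = cand, loss
        if best < 0:  # victim fooled
            return decode(np.concatenate([adv, vis])), q + 1
    return decode(np.concatenate([adv, vis])), max_queries
```

Because the visual half `vis` never changes, the decoded candidates stay close to the original appearance while the search explores only adversarially relevant directions, which is the source of the query efficiency the abstract claims.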
- Sign Bits are All You Need for Black-Box Attacks. In Proc. Int. Conf. Learn. Representat.
- Deep Variational Information Bottleneck. In Proc. Int. Conf. Learn. Representat.
- Generative Adversarial Networks: A Comprehensive Review. Data Wrangling: Concepts, Applications and Tools, 213.
- Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. In Proc. Int. Conf. Learn. Representat.
- Blackbox Attacks via Surrogate Ensemble Search. In Proc. Int. Conf. Neural Inf. Process. Sys., volume 35, 5348–5362.
- Towards Evaluating the Robustness of Neural Networks. In Proc. IEEE Conf. Symp. Security Privacy., 39–57.
- Improving Black-Box Adversarial Attacks with a Transfer-Based Prior. In Proc. Int. Conf. Neural Inf. Process. Sys., volume 32.
- Minimally Distorted Adversarial Examples with a Fast Adaptive Boundary Attack. In Proc. Int. Conf. Mach. Learn., 2196–2205. PMLR.
- Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-Free Attacks. In Proc. Int. Conf. Mach. Learn., 2206–2216.
- Benchmarking Adversarial Robustness on Image Classification. In Proc. IEEE Conf. Comput. Vis. Pattern Recogn., 321–331.
- Boosting Adversarial Attacks with Momentum. In Proc. IEEE Conf. Comput. Vis. Pattern Recogn., 9185–9193.
- Query-efficient Meta Attack to Deep Neural Networks. In Proc. Int. Conf. Learn. Representat.
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution. In Proc. IEEE Conf. Comput. Vis. Pattern Recogn., 15095–15104.
- Generative Adversarial Nets. In Proc. Int. Conf. Neural Inf. Process. Sys., volume 27.
- Google. 2023. Google Cloud Vision API. https://cloud.google.com/vision. Accessed: 2023-08-15.
- Simple Black-Box Adversarial Attacks. In Proc. Int. Conf. Mach. Learn., 2484–2493.
- Countering Adversarial Images using Input Transformations. In Proc. Int. Conf. Learn. Representat.
- Subspace Attack: Exploiting Promising Subspaces for Query-Efficient Black-Box Attacks. In Proc. Int. Conf. Neural Inf. Process. Sys., volume 32.
- Autoencoders, Minimum Description Length and Helmholtz Free Energy. In Proc. Int. Conf. Neural Inf. Process. Sys., volume 6.
- Black-Box Adversarial Attack with Transferable Model-based Embedding. In Proc. Int. Conf. Learn. Representat.
- Black-box Adversarial Attacks with Limited Queries and Information. In Proc. Int. Conf. Mach. Learn., 2137–2146.
- Imagga. 2023. AI-Powered Image Tagging API. https://imagga.com/solutions/auto-tagging. Accessed: 2023-08-15.
- Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck. In Proc. Int. Conf. Neural Inf. Process. Sys., volume 34, 17148–17159.
- A Comprehensive Survey on Design and Application of Autoencoder in Deep Learning. Applied Soft Computing, 110176.
- Nattack: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks. In Proc. Int. Conf. Mach. Learn., 3866–3876.
- Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation. IEEE Trans. on Pattern Anal. and Mach. Intell., 43(4): 1254–1266.
- Attacking Deep Networks with Surrogate-Based Adversarial Black-Box Methods is Easy. In Proc. Int. Conf. Learn. Representat.
- Discriminator-Free Generative Adversarial Attack. In Proc. ACM Int. Conf. Multimedia., 1544–1552.
- Towards Deep Learning Models Resistant to Adversarial Attacks. In Proc. Int. Conf. Learn. Representat.
- Advflow: Inconspicuous Black-Box Adversarial Attacks using Normalizing Flows. In Proc. Int. Conf. Neural Inf. Process. Sys., 15871–15884.
- Deeply Supervised Discriminative Learning for Adversarial Defense. IEEE Trans. on Pattern Anal. and Mach. Intell., 43(9): 3154–3166.
- A Self-Supervised Approach for Adversarial Robustness. In Proc. IEEE Conf. Comput. Vis. Pattern Recogn., 262–271.
- Diffusion Models for Adversarial Purification. In Proc. Int. Conf. Mach. Learn.
- Training Meta-Surrogate Model for Transferable Adversarial Attack. In Proc. of the AAAI Conf. on Artif. Intell., volume 37, 9516–9524.
- Do Adversarially Robust ImageNet Models Transfer Better? arXiv:2007.08489.
- Open-Set Recognition: A Good Closed-Set Classifier is All You Need. In Proc. Int. Conf. Learn. Representat.
- Cross-domain Face Presentation Attack Detection via Multi-Domain Disentangled Representation Learning. In Proc. IEEE Conf. Comput. Vis. Pattern Recogn., 6678–6687.
- Defending Adversarial Attacks via Semantic Feature Manipulation. IEEE Trans. on Serv. Comput., 15(6): 3184–3197.
- Weiai, C. 2023. Pytorch-cifar100. https://github.com/weiaicunzai/pytorch-cifar100. Accessed: 2023-08-15.
- Natural Evolution Strategies. J. Mach. Learn. Res., 15(1): 949–980.
- Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. In Proc. Netw. Distrib. Syst. Secur. Symp.
- Adversarial Robustness through Disentangled Representations. In Proc. of the AAAI Conf. on Artif. Intell., volume 35, 3145–3153.
- Generalizable Black-Box Adversarial Attack With Meta Learning. IEEE Trans. on Pattern Anal. and Mach. Intell., 1–13.
- Walking on the Edge: Fast, Low-Distortion Adversarial Examples. IEEE Trans. Inf. Forensics and Security, 16: 701–713.
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proc. IEEE Conf. Comput. Vis.
- Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification. In Euro. Conf. on Comput. Vis., 87–104.