Unsegment Anything by Simulating Deformation (2404.02585v1)
Abstract: Foundation segmentation models, while powerful, pose a significant risk: they enable users to effortlessly extract any object from any digital content with a single click, potentially leading to copyright infringement or malicious misuse. To mitigate this risk, we introduce a new task, "Anything Unsegmentable", to grant any image "the right to be unsegmented". The ambitious goal of the task is to achieve highly transferable adversarial attacks against all prompt-based segmentation models, regardless of model parameterization and prompt. We highlight the non-transferable and heterogeneous nature of prompt-specific adversarial noise. Our approach focuses on disrupting image encoder features to achieve prompt-agnostic attacks. Intriguingly, targeted feature attacks exhibit better transferability than untargeted ones, suggesting that the optimal update direction aligns with the image manifold. Based on these observations, we design a novel attack named Unsegment Anything by Simulating Deformation (UAD). Our attack optimizes a differentiable deformation function to create a target deformed image, which alters structural information while keeping the feature distance achievable by an adversarial example. Extensive experiments verify the effectiveness of our approach, which compromises a variety of promptable segmentation models with different architectures and prompt interfaces. We release the code at https://github.com/jiahaolu97/anything-unsegmentable.
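The targeted feature attack described in the abstract can be illustrated with a minimal sketch. It is not the released UAD implementation: the "image encoder" is assumed to be a random linear map `W`, the "deformed target" is a crude horizontal shift made with `np.roll` rather than an optimized deformation function, and the names `encode` and `targeted_feature_attack` and the values of `eps`, `step` and `iters` are hypothetical. The sketch only shows the general idea of pulling the encoder features of a perturbed image toward those of a target image under an L-infinity budget, without using any prompt.

```python
import numpy as np


def encode(W, x):
    """Toy stand-in for an image encoder: flatten the image and project it."""
    return W @ x.ravel()


def targeted_feature_attack(W, x, x_target, eps=8 / 255, step=1 / 255, iters=100):
    """L-infinity PGD that pulls encode(x + delta) toward encode(x_target).

    Only encoder features are used, so the perturbation is independent of
    any segmentation prompt.
    """
    f_target = encode(W, x_target)
    delta = np.zeros_like(x)
    for _ in range(iters):
        residual = encode(W, x + delta) - f_target
        # Analytic gradient of ||W (x + delta) - f_target||^2 w.r.t. delta.
        grad = (2.0 * W.T @ residual).reshape(x.shape)
        # Signed descent step, then projection onto the eps-ball.
        delta = np.clip(delta - step * np.sign(grad), -eps, eps)
        # Keep the adversarial image inside the valid pixel range [0, 1].
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return x + delta


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(8, 8))   # toy grayscale "image"
W = rng.normal(size=(16, 64))            # assumed linear encoder weights
x_target = np.roll(x, shift=2, axis=1)   # crude stand-in for a deformed target

x_adv = targeted_feature_attack(W, x, x_target)
dist_before = np.linalg.norm(encode(W, x) - encode(W, x_target))
dist_after = np.linalg.norm(encode(W, x_adv) - encode(W, x_target))
print(f"feature distance to target: {dist_before:.3f} -> {dist_after:.3f}")
```

With a real promptable segmentation model, the linear map would be replaced by the model's image encoder and the analytic gradient by automatic differentiation. The budget is deliberately small, so the adversarial features move toward the target without necessarily reaching it.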