CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection (2403.18554v2)
Abstract: Co-salient object detection (CoSOD) aims to identify the common and salient (usually foreground) regions across a given group of images. Despite significant progress, state-of-the-art CoSOD methods are easily affected by adversarial perturbations, which substantially reduce their accuracy. Such perturbations can mislead CoSOD methods but do not change the high-level semantic information (e.g., the concept) of the co-salient objects. In this paper, we propose a novel robustness enhancement framework that first learns the concept of the co-salient objects from the input group images and then leverages this concept to purify the adversarial perturbations; the purified images are subsequently fed to CoSOD methods for robustness enhancement. Specifically, we propose CosalPure, which contains two modules: group-image concept learning and concept-guided diffusion purification. For the first module, we adopt a pre-trained text-to-image diffusion model to learn the concept of the co-salient objects within the group images, where the learned concept is robust to adversarial examples. For the second module, we map the adversarial image to the latent space and then perform diffusion generation by embedding the learned concept into the noise prediction function as an extra condition. Our method can effectively alleviate the influence of the SOTA adversarial attack containing different adversarial patterns, including exposure and noise. Extensive results demonstrate that our method significantly enhances the robustness of CoSOD methods.
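The two modules can be sketched in standard latent-diffusion notation. This is an illustrative sketch under stated assumptions, not necessarily the paper's exact formulation. The encoder $\mathcal{E}$, decoder $\mathcal{D}$, noise schedule $\bar\alpha_t$, intermediate timestep $t^{*}$, the textual-inversion-style objective, and the deterministic DDIM-style update are all assumptions introduced here for illustration. Only the use of a pre-trained text-to-image diffusion model and of the learned concept as an extra condition of the noise predictor comes from the abstract.

```latex
% Module 1 (assumed textual-inversion-style objective): learn a concept
% embedding c^* from the group images {x_i}, keeping the denoiser frozen.
z_0^{(i)} = \mathcal{E}(x_i), \qquad
z_t^{(i)} = \sqrt{\bar\alpha_t}\, z_0^{(i)} + \sqrt{1-\bar\alpha_t}\,\epsilon, \qquad
c^{*} = \arg\min_{c}\; \mathbb{E}_{i,\,t,\,\epsilon \sim \mathcal{N}(0, I)}
  \left\| \epsilon - \epsilon_\theta\!\left(z_t^{(i)},\, t,\, c\right) \right\|_2^2

% Module 2, step 1 (assumed): encode the adversarial image and noise its
% latent up to an intermediate timestep t^*.
z_0 = \mathcal{E}(x^{\mathrm{adv}}), \qquad
z_{t^{*}} = \sqrt{\bar\alpha_{t^{*}}}\, z_0 + \sqrt{1-\bar\alpha_{t^{*}}}\,\epsilon,
\quad \epsilon \sim \mathcal{N}(0, I)

% Module 2, step 2: denoise from t^* down to 0 with c^* as the extra
% condition of the noise predictor (deterministic DDIM-style update).
\hat z_0(z_t) = \frac{z_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(z_t, t, c^{*})}{\sqrt{\bar\alpha_t}}, \qquad
z_{t-1} = \sqrt{\bar\alpha_{t-1}}\,\hat z_0(z_t) + \sqrt{1-\bar\alpha_{t-1}}\,\epsilon_\theta(z_t, t, c^{*})

% Decode the final latent; the purified image is then fed to the CoSOD method.
x^{\mathrm{pure}} = \mathcal{D}(z_0^{\mathrm{final}})
```

Under these assumptions, the choice of $t^{*}$ trades off removing the perturbation (larger $t^{*}$ injects more noise) against preserving image content, while the condition $c^{*}$ steers the reverse process toward the co-salient concept.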