Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation (2403.08294v1)

Published 13 Mar 2024 in cs.CV

Abstract: Existing generative adversarial network (GAN) based conditional image generative models typically produce a fixed output for the same conditional input, which is unreasonable for highly subjective tasks such as large-mask image inpainting or style transfer. On the other hand, GAN-based diverse image generation methods require retraining or fine-tuning the network, or designing complex noise injection functions, making them computationally expensive, task-specific, or prone to low-quality results. Given that many deterministic conditional image generative models already produce high-quality yet fixed results, we raise an intriguing question: can pre-trained deterministic conditional image generative models generate diverse results without changing their network structures or parameters? To answer this question, we re-examine conditional image generation from the perspective of adversarial attack and propose a simple and efficient plug-in projected gradient descent (PGD)-like method for diverse and controllable image generation. The key idea is to attack the pre-trained deterministic generative model by adding a micro perturbation to its input condition. In this way, diverse results can be generated without any adjustment of network structures or fine-tuning of the pre-trained models. In addition, the diversity can be controlled by specifying the attack direction according to a reference text or image. Our work opens the door to applying adversarial attacks to low-level vision tasks, and experiments on various conditional image generation tasks demonstrate the effectiveness and superiority of the proposed method.
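To make the core mechanism concrete, below is a minimal PyTorch sketch of a PGD-like attack on the input condition of a frozen generator. The perturbation is optimized either to push the output away from the deterministic result (diversity) or against a user-supplied loss toward a reference (controllability). This is an illustrative sketch only: the names `pgd_diversify` and `direction_loss` are hypothetical, and the paper's exact objective, step size, and projection radius are not specified in the abstract.

```python
import torch
import torch.nn.functional as F

def pgd_diversify(generator, condition, steps=10, alpha=1e-3,
                  epsilon=1e-2, direction_loss=None):
    """PGD-like attack on the input condition of a frozen generator.

    generator:      a pre-trained deterministic model, kept frozen
    condition:      the conditional input tensor (e.g., masked image)
    direction_loss: optional callable mapping an output image to a
                    scalar loss that encodes a reference text/image
                    (e.g., a CLIP similarity loss); hypothetical here
    """
    generator.eval()
    for p in generator.parameters():
        p.requires_grad_(False)  # network weights are never touched

    base_output = generator(condition).detach()

    # Random start inside the epsilon ball around the condition.
    delta = torch.empty_like(condition).uniform_(-epsilon, epsilon)
    delta.requires_grad_(True)

    for _ in range(steps):
        output = generator(condition + delta)
        if direction_loss is not None:
            # Controllable mode: descend toward the reference.
            loss = direction_loss(output)
        else:
            # Diverse mode: maximize divergence from the fixed output.
            loss = -F.mse_loss(output, base_output)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # signed gradient step
            delta.clamp_(-epsilon, epsilon)     # project into the ball
        delta.grad.zero_()

    with torch.no_grad():
        return generator(condition + delta)
```

With `direction_loss` left as `None`, each random initialization of `delta` yields a different sample for the same condition; supplying, for instance, a CLIP-based loss against a reference text or image would steer the perturbed outputs in that direction, matching the controllable setting described in the abstract.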

Authors (8)
  1. Tianyi Chu (11 papers)
  2. Wei Xing (34 papers)
  3. Jiafu Chen (5 papers)
  4. Zhizhong Wang (14 papers)
  5. Jiakai Sun (8 papers)
  6. Lei Zhao (808 papers)
  7. Haibo Chen (93 papers)
  8. Huaizhong Lin (7 papers)
