
CIMGEN: Controlled Image Manipulation by Finetuning Pretrained Generative Models on Limited Data (2401.13006v1)

Published 23 Jan 2024 in cs.AI, cs.LG, and eess.IV

Abstract: Content creation and image editing can benefit from flexible user controls. A common intermediate representation for conditional image generation is a semantic map, which encodes the objects present in the image. Compared with raw RGB pixels, a semantic map is much easier to modify: one can selectively insert, remove, or replace objects in the map. The method proposed in this paper takes the modified semantic map and alters the original image in accordance with it. The method leverages traditional pre-trained image-to-image translation GANs, such as CycleGAN or Pix2Pix GAN, that are fine-tuned on a limited dataset of reference images associated with the semantic maps. We discuss the qualitative and quantitative performance of our technique to illustrate its capacity and possible applications in the fields of image forgery and image editing. We also demonstrate the effectiveness of the proposed image forgery technique in thwarting numerous deep learning-based image forensic techniques, highlighting the urgent need to develop robust and generalizable image forensic tools in the fight against the spread of fake media.
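
The map-editing step described in the abstract (inserting, removing, or replacing objects in a semantic map) can be illustrated with a minimal sketch. This is not the paper's code: the class IDs, the helper names `remove_class` and `insert_rect`, and the toy 6x6 map are all assumptions made for illustration, and the subsequent GAN translation step is not shown.

```python
import numpy as np

# Hypothetical class IDs, chosen only for this illustration.
BACKGROUND, ROAD, BUILDING, TREE = 0, 1, 2, 3


def remove_class(sem_map, target, fill):
    """Return a copy where every pixel labelled `target` is relabelled `fill`."""
    out = sem_map.copy()
    out[out == target] = fill
    return out


def insert_rect(sem_map, label, top, left, height, width):
    """Return a copy with a rectangular region relabelled as `label`."""
    out = sem_map.copy()
    out[top:top + height, left:left + width] = label
    return out


# Toy semantic map: a road band across the middle, a building in one corner.
sem_map = np.full((6, 6), BACKGROUND, dtype=np.uint8)
sem_map[2:4, :] = ROAD
sem_map[0:2, 0:2] = BUILDING

# Remove the building, then insert a tree below the road.
edited = remove_class(sem_map, BUILDING, BACKGROUND)
edited = insert_rect(edited, TREE, top=4, left=3, height=2, width=2)

# Pixels whose label changed; an image-to-image model would be expected
# to alter the original image mainly in these regions.
changed_mask = edited != sem_map
```

Editing labels this way touches only a small integer array, which is why the abstract calls it easier than editing RGB pixels directly; the edited map would then serve as the conditioning input to the fine-tuned translation model.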

