
PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning (2401.00653v1)

Published 1 Jan 2024 in cs.CV

Abstract: Deceptive images can be shared in seconds through social networking services, posing substantial risks. Massive networks in the Image Manipulation Localization (IML) field have placed significant emphasis on tampering traces, such as boundary artifacts and high-frequency information. However, these traces are vulnerable to image post-processing operations, which limits the generalization and robustness of existing methods. We present a novel Prompt-IML framework. Observing that humans tend to discern the authenticity of an image based on both semantic and high-frequency information, we design the framework to leverage rich semantic knowledge from pre-trained visual foundation models to assist IML. We are the first to design a framework that utilizes visual foundation models specifically for the IML task. Moreover, we design a Feature Alignment and Fusion module to align and fuse semantic features with high-frequency features, with the aim of locating tampered regions from multiple perspectives. Experimental results demonstrate that our model achieves better performance on eight typical fake image datasets, along with outstanding robustness.
