PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning (2401.00653v1)
Abstract: Deceptive images can be shared in seconds via social networking services, posing substantial risks. Many networks in the Image Manipulation Localization (IML) field rely heavily on tampering traces such as boundary artifacts and high-frequency information. However, these traces are vulnerable to image post-processing operations, which limits the generalization and robustness of existing methods. We present a novel Prompt-IML framework. We observe that humans tend to judge the authenticity of an image based on both semantic and high-frequency information. Inspired by this, the proposed framework leverages rich semantic knowledge from pre-trained visual foundation models to assist IML. We are the first to design a framework that utilizes visual foundation models specifically for the IML task. Moreover, we design a Feature Alignment and Fusion module that aligns and fuses semantic features with high-frequency features, with the aim of locating tampered regions from multiple perspectives. Experimental results demonstrate that our model achieves better performance on eight typical fake image datasets, along with outstanding robustness.
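To make the two kinds of cues mentioned in the abstract more concrete, the following is a minimal sketch and not the paper's method. It assumes a box-blur residual as a stand-in for "high-frequency information" and a plain normalized average as a stand-in for fusion; the function names `high_frequency_residual` and `fuse` are hypothetical and do not come from the paper, whose Feature Alignment and Fusion module is a learned component.

```python
import numpy as np


def high_frequency_residual(image: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the image minus its k x k box-blurred copy (a crude high-pass filter).

    Assumption: this is only an illustrative proxy for high-frequency cues.
    """
    img = image.astype(np.float64)
    pad = k // 2
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(img, pad, mode="edge")
    blurred = np.zeros(img.shape, dtype=np.float64)
    height, width = img.shape
    # Sum the k*k shifted copies of the image, then divide to get the mean.
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + height, dx:dx + width]
    blurred /= k * k
    return img - blurred


def fuse(semantic: np.ndarray, high_freq: np.ndarray) -> np.ndarray:
    """Toy fusion: rescale both maps to [0, 1] and average them.

    Assumption: a fixed average replaces the learned fusion used in practice.
    """
    def rescale(x: np.ndarray) -> np.ndarray:
        x = x.astype(np.float64)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    return 0.5 * (rescale(semantic) + rescale(np.abs(high_freq)))


if __name__ == "__main__":
    # A single bright pixel stands in for a sharp, localized artifact.
    spike = np.zeros((5, 5))
    spike[2, 2] = 1.0
    residual = high_frequency_residual(spike)
    print(fuse(spike, residual).round(3))
```

A flat region yields a zero residual while an isolated sharp change yields a large one, which is why cues of this kind are sensitive to post-processing such as blurring or compression.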