TBFormer: Two-Branch Transformer for Image Forgery Localization (2302.13004v1)
Abstract: Image forgery localization aims to identify forged regions by capturing subtle traces in high-quality discriminative features. In this paper, we propose a Transformer-style network with two feature extraction branches for image forgery localization, named the Two-Branch Transformer (TBFormer). First, two feature extraction branches are elaborately designed, taking advantage of discriminative stacked Transformer layers, to extract both RGB-domain and noise-domain features. Second, an Attention-aware Hierarchical-feature Fusion Module (AHFM) is proposed to effectively fuse hierarchical features from the two domains. Although the two branches share the same architecture, their features differ significantly because they are extracted from different domains; we therefore adopt position attention to embed them into a unified feature domain for hierarchical feature investigation. Finally, a Transformer decoder reconstructs the fused features to generate the predicted mask. Extensive experiments on publicly available datasets demonstrate the effectiveness of the proposed model.
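To make the pipeline concrete, the sketch below mirrors the three stages the abstract describes: two Transformer encoders over the RGB image and a noise residual, an attention-based fusion of the two token streams into a unified domain, and a Transformer block that reconstructs a per-pixel mask. This is a minimal illustration, not the paper's implementation: the fixed high-pass noise extractor, the cross-attention block standing in for position attention, and all hyperparameters (`dim`, `patch`, layer depths) are assumptions, and AHFM's multi-level hierarchical fusion is collapsed to a single level for brevity.

```python
# Minimal sketch of a two-branch Transformer for forgery localization.
# Assumptions (not from the paper): a fixed high-pass filter as the noise
# extractor, cross-attention as a stand-in for position attention, and a
# single fusion level instead of AHFM's hierarchical fusion.
import torch
import torch.nn as nn


class PatchEncoder(nn.Module):
    """ViT-style patch embedding followed by stacked Transformer layers.
    Positional embeddings are omitted here for brevity."""
    def __init__(self, in_ch=3, dim=256, patch=16, depth=4, heads=8):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        tokens = self.proj(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.encoder(tokens)


class AttentionFusion(nn.Module):
    """Embed RGB and noise tokens into a shared feature domain.
    Cross-attention here is an illustrative substitute for the
    position attention the abstract mentions."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tok, noise_tok):
        fused, _ = self.attn(rgb_tok, noise_tok, noise_tok)
        return self.norm(rgb_tok + fused)


class TwoBranchLocalizer(nn.Module):
    def __init__(self, dim=256, patch=16):
        super().__init__()
        self.patch = patch
        # Fixed Laplacian-style high-pass kernel as a simple stand-in
        # noise extractor, applied per channel (groups=3).
        hp = torch.tensor([[[-1., 2., -1.], [2., -4., 2.], [-1., 2., -1.]]]) / 4.
        self.register_buffer("hp_kernel", hp.repeat(3, 1, 1).unsqueeze(1))
        self.rgb_branch = PatchEncoder(3, dim, patch)
        self.noise_branch = PatchEncoder(3, dim, patch)
        self.fuse = AttentionFusion(dim)
        dec_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8,
                                               batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=2)
        self.head = nn.Linear(dim, patch * patch)  # per-token mask logits

    def forward(self, img):
        noise = nn.functional.conv2d(img, self.hp_kernel, padding=1, groups=3)
        rgb_tok = self.rgb_branch(img)
        noise_tok = self.noise_branch(noise)
        fused = self.decoder(self.fuse(rgb_tok, noise_tok))
        b, n, _ = fused.shape
        h = w = int(n ** 0.5)
        # Reassemble per-patch predictions into a full-resolution mask.
        logits = self.head(fused).view(b, h, w, self.patch, self.patch)
        return logits.permute(0, 1, 3, 2, 4).reshape(
            b, 1, h * self.patch, w * self.patch)


if __name__ == "__main__":
    model = TwoBranchLocalizer()
    mask = model(torch.randn(1, 3, 256, 256))
    print(mask.shape)  # torch.Size([1, 1, 256, 256])
```

In the actual model, features taken from multiple encoder depths would feed the hierarchical fusion module rather than only the final layer's tokens, which is what gives AHFM its multi-scale character.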