Fusion Transformer with Object Mask Guidance for Image Forgery Analysis (2403.12229v2)
Abstract: In this work, we introduce OMG-Fuser, a fusion transformer-based network designed to extract information from various forensic signals to enable robust image forgery detection and localization. Our approach can operate with an arbitrary number of forensic signals and leverages object information for their analysis -- unlike previous methods that rely on fusion schemes with few signals and often disregard image semantics. To this end, we design a forensic signal stream composed of a transformer guided by an object attention mechanism, associating patches that depict the same objects. In that way, we incorporate object-level information from the image. Each forensic signal is processed by a different stream that adapts to its peculiarities. A token fusion transformer efficiently aggregates the outputs of an arbitrary number of network streams and generates a fused representation for each image patch. We assess two fusion variants on top of the proposed approach: (i) score-level fusion that fuses the outputs of multiple image forensics algorithms and (ii) feature-level fusion that fuses low-level forensic traces directly. Both variants exceed state-of-the-art performance on seven datasets for image forgery detection and localization, with a relative average improvement of 12.1% and 20.4% in terms of F1. Our model is robust against traditional and novel forgery attacks and can be expanded with new signals without training from scratch. Our code is publicly available at: https://github.com/mever-team/omgfuser
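The two mechanisms named in the abstract, attention restricted to patches that depict the same object and per-patch fusion of tokens from an arbitrary number of streams, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (the linked repository holds that). The function names `object_attention_mask`, `masked_attention`, and `fuse_patch_tokens`, the per-patch sets of object ids, and the single fusion query are all assumptions made for illustration. Learned projections, multiple heads, and stacked layers are omitted.

```python
import math


def object_attention_mask(patch_objects):
    """Build an N x N boolean mask from per-patch object ids (hypothetical input).

    patch_objects[i] is the set of object ids that overlap patch i.
    Entry (i, j) is True when patches i and j share at least one object;
    the diagonal is always True so every patch can attend to itself.
    """
    n = len(patch_objects)
    return [
        [i == j or bool(patch_objects[i] & patch_objects[j]) for j in range(n)]
        for i in range(n)
    ]


def masked_attention(queries, keys, values, mask):
    """Single-head scaled dot-product attention with a boolean mask.

    Positions where mask[i][j] is False receive a score of -inf and therefore
    zero weight after the softmax. Each query row must have at least one
    allowed key.
    """
    dim = len(queries[0])
    outputs = []
    for i, query in enumerate(queries):
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            if mask[i][j]
            else -math.inf
            for j, key in enumerate(keys)
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(score - top) for score in scores]
        total = sum(weights)
        outputs.append(
            [
                sum(w * value[c] for w, value in zip(weights, values)) / total
                for c in range(len(values[0]))
            ]
        )
    return outputs


def fuse_patch_tokens(stream_tokens, fusion_query):
    """Fuse tokens from any number of streams into one token per patch.

    stream_tokens[s][p] is the token produced by stream s for patch p.
    For each patch, a single (assumed) fusion query attends over the tokens
    of all streams at that patch position.
    """
    num_patches = len(stream_tokens[0])
    fused = []
    for p in range(num_patches):
        tokens = [stream[p] for stream in stream_tokens]
        allow_all = [[True] * len(tokens)]
        fused.append(masked_attention([fusion_query], tokens, tokens, allow_all)[0])
    return fused


# Patches 0 and 1 depict object 1, patch 2 depicts object 2, patch 3 depicts none.
patch_objects = [{1}, {1}, {2}, set()]
mask = object_attention_mask(patch_objects)
tokens = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [2.0, 2.0]]
attended = masked_attention(tokens, tokens, tokens, mask)
```

In this toy input, patches 0 and 1 exchange information because they share an object, while patches 2 and 3 attend only to themselves. Because `fuse_patch_tokens` iterates over whatever streams it receives, adding a stream changes the number of tokens attended over per patch, not the shape of the fused output.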
Authors: Dimitrios Karageorgiou, Giorgos Kordopatis-Zilos, Symeon Papadopoulos