Multi-modal Document Presentation Attack Detection With Forensics Trace Disentanglement (2404.06663v1)

Published 10 Apr 2024 in cs.CV

Abstract: Document Presentation Attack Detection (DPAD) is an important measure for protecting the authenticity of a document image. However, recent DPAD methods demand additional resources, such as manual effort to collect additional data or knowledge of the parameters of the acquisition devices. This work proposes a DPAD method based on multi-modal disentangled traces (MMDT) that avoids these drawbacks. We first disentangle the recaptured traces with a self-supervised disentanglement and synthesis network to improve generalization across document images with different contents and layouts. Then, unlike existing DPAD approaches that rely only on data in the RGB domain, we explicitly employ the disentangled recaptured traces as new modalities in the transformer backbone, using adaptive multi-modal adapters to fuse RGB and trace features efficiently. Visualization of the disentangled traces confirms the effectiveness of the proposed method across different document contents. Extensive experiments on three benchmark datasets demonstrate the superiority of our MMDT method in representing forensic traces of recapturing distortion.
