
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning (2402.02003v4)

Published 3 Feb 2024 in cs.CV

Abstract: Photorealistic generators have reached a critical juncture: authentic and manipulated images are becoming increasingly indistinguishable. Benchmarking and advancing techniques for detecting digital manipulation has therefore become an urgent issue. Although a number of face forgery datasets are publicly available, their forged faces are mostly generated with GAN-based synthesis and do not cover the most recent technologies such as diffusion models. Since the diversity and quality of diffusion-generated images have improved significantly, a much more challenging face forgery dataset is needed to evaluate state-of-the-art (SOTA) forgery detection methods. In this paper, we propose GenFace, a large-scale, diverse, fine-grained, and high-fidelity dataset to facilitate the advancement of deepfake detection. It contains a large number of forged faces produced by advanced generators, including diffusion-based models, together with detailed labels describing the manipulation approach and the generator used. In addition to evaluating SOTA approaches on our benchmark, we design a cross appearance-edge learning (CAEL) detector that captures multi-grained appearance and edge global representations to detect discriminative and general forgery traces. We further devise an appearance-edge cross-attention (AECA) module to explore various integrations across the two domains. Extensive experimental results and visualizations show that our detector outperforms the state of the art under cross-generator, cross-forgery, and cross-dataset evaluation settings. Code and datasets will be available at https://github.com/Jenine-321/GenFace.
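The abstract's two key ingredients can be illustrated in miniature: an edge map extracted from the image (so the detector sees boundary artifacts as well as appearance), and a cross-attention step in which appearance tokens attend to edge tokens. The following NumPy sketch is illustrative only; the function names, token dimensions, and single-head formulation are assumptions for exposition, not the paper's actual CAEL/AECA implementation.

```python
import numpy as np

def sobel_edges(img):
    """Extract a simple edge-magnitude map with 3x3 Sobel filters (valid convolution)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)  # horizontal gradient
            gy[i, j] = np.sum(patch * ky)  # vertical gradient
    return np.hypot(gx, gy)  # gradient magnitude per pixel

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    With queries taken from appearance tokens and keys/values from edge
    tokens, each appearance token is re-expressed as a weighted mixture
    of edge information -- the basic idea behind fusing two domains.
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)      # (n_q, n_kv) similarity
    return softmax(scores, axis=-1) @ values    # (n_q, d) fused output

# Toy usage: 4 appearance tokens attend over 6 edge tokens, dim 8.
rng = np.random.default_rng(0)
appearance = rng.standard_normal((4, 8))
edge_tokens = rng.standard_normal((6, 8))
fused = cross_attention(appearance, edge_tokens, edge_tokens)
```

A full detector would run attention in both directions (appearance-to-edge and edge-to-appearance) with learned projections for queries, keys, and values; this sketch omits the learned weights to keep the mechanism visible.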

Authors (7)
  1. Yaning Zhang (5 papers)
  2. Zitong Yu (119 papers)
  3. Xiaobin Huang (7 papers)
  4. Linlin Shen (133 papers)
  5. Jianfeng Ren (25 papers)
  6. Tianyi Wang (83 papers)
  7. Zan Gao (19 papers)
Citations (3)