Exposing Lip-syncing Deepfakes from Mouth Inconsistencies (2401.10113v2)
Abstract: A lip-syncing deepfake is a digitally manipulated video in which AI models generate a person's lip movements to convincingly match altered or entirely new audio. Lip-syncing deepfakes are especially dangerous because their artifacts are confined to the lip region and are therefore harder to discern. In this paper, we describe a novel approach, LIP-syncing detection based on mouth INConsistency (LIPINC), which detects lip-syncing deepfakes by identifying temporal inconsistencies in the mouth region. These inconsistencies appear both between adjacent frames and across the video as a whole. Our model successfully captures these irregularities and outperforms state-of-the-art methods on several benchmark deepfake datasets. Code is available at https://github.com/skrantidatta/LIPINC
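To make the idea of temporal mouth inconsistency concrete, the following is a minimal sketch and not the LIPINC model itself. It assumes the mouth region has already been cropped from each frame into a grayscale array of shape (T, H, W) with values in [0, 1]. The function names `adjacent_frame_inconsistency` and `video_level_inconsistency` are hypothetical, and the mean absolute pixel difference is a stand-in for whatever learned features the actual method uses.

```python
import numpy as np


def adjacent_frame_inconsistency(mouths: np.ndarray) -> np.ndarray:
    """Mean absolute pixel difference between each pair of adjacent mouth crops.

    mouths: float array of shape (T, H, W). Returns an array of shape (T - 1,).
    Assumption: raw pixel difference stands in for a learned inconsistency measure.
    """
    diffs = np.abs(mouths[1:] - mouths[:-1])
    return diffs.mean(axis=(1, 2))


def video_level_inconsistency(mouths: np.ndarray) -> float:
    """Mean absolute pixel difference averaged over every pair of frames.

    This looks beyond adjacent frames, at the video as a whole.
    """
    num_frames = mouths.shape[0]
    flat = mouths.reshape(num_frames, -1)
    # Pairwise (T, T) matrix of mean absolute differences via broadcasting.
    pairwise = np.abs(flat[:, None, :] - flat[None, :, :]).mean(axis=2)
    # Keep each unordered pair once (upper triangle, excluding the diagonal).
    upper = np.triu_indices(num_frames, k=1)
    return float(pairwise[upper].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    crops = rng.random((8, 16, 16))  # synthetic stand-in for real mouth crops
    print(adjacent_frame_inconsistency(crops).shape)
    print(video_level_inconsistency(crops))
```

In practice a detector would not threshold these raw scores directly. They only illustrate the two time scales the abstract mentions: adjacent frames and the whole video.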