Explainable Deepfake Video Detection using Convolutional Neural Network and CapsuleNet (2404.12841v1)
Abstract: Deepfake technology, which is derived from deep learning and rooted in machine learning and AI, can seamlessly insert a person into digital media regardless of whether that person actually took part. Deepfakes initially served research, industry, and entertainment. Although the concept has existed for decades, recent advances have made deepfakes nearly indistinguishable from real footage, and the tools have become accessible enough that even novices can create convincing ones. This accessibility raises security concerns. The primary algorithm used to create deepfakes, the GAN (Generative Adversarial Network), uses machine learning to generate realistic images or videos. Our objective is to use a CNN (Convolutional Neural Network) and CapsuleNet with LSTM to distinguish deepfake-generated frames from original ones. We also aim to explain our model's decision-making process through Explainable AI, fostering transparent human-AI relationships and offering practical examples for real-life scenarios.
- Gazi Hasin Ishrak
- Zalish Mahmud
- Tahera Khanom Tinni
- Tanzim Reza
- Mohammad Zavid Parvez
- Md. Zami Al Zunaed Farabe