Towards Real-world Video Face Restoration: A New Benchmark (2404.19500v2)
Abstract: Blind face restoration (BFR) on images has progressed significantly over the last several years, while real-world video face restoration (VFR), which is more challenging because it involves more complex face motions such as changing gaze directions and facial orientations, remains unsolved. Typical BFR methods are evaluated on privately synthesized datasets or self-collected real-world low-quality face images, which offer limited coverage of real-world video frames. In this work, we introduced new real-world datasets named FOS, with a taxonomy of "Full, Occluded, and Side" faces drawn mainly from video frames, to study the applicability of current methods to videos. Compared with existing test datasets, the FOS datasets cover more diverse degradations and include face samples from more complex scenarios, which allows current face restoration approaches to be revisited more comprehensively. Using these datasets, we benchmarked both state-of-the-art BFR methods and video super-resolution (VSR) methods, identifying their potential and limitations in VFR tasks. In addition, we studied the effectiveness of commonly used image quality assessment (IQA) metrics and face IQA (FIQA) metrics by leveraging a subjective user study. Through extensive experimental results and detailed analysis, we gained insights from the successes and failures of both current BFR and VSR methods. These results also pose challenges to current face restoration approaches, which we hope will stimulate future advances in VFR research.
- Ziyan Chen
- Jingwen He
- Xinqi Lin
- Yu Qiao
- Chao Dong