Label-guided Facial Retouching Reversion (2404.14177v2)
Abstract: With the popularity of social media platforms and retouching tools, more people are beautifying their facial photos, which poses challenges for fields that require photo authenticity. To address this issue, prior work has proposed makeup removal methods, but these cannot revert the geometric deformations that retouching introduces. To tackle the problem of facial retouching reversion, we propose a framework, dubbed Re-Face, which consists of three components: a facial retouching detector, an image reversion model named FaceR, and a color correction module called Hierarchical Adaptive Instance Normalization (H-AdaIN). FaceR uses labels generated by the facial retouching detector as guidance to revert the retouched facial images. Color correction is then performed with H-AdaIN to address color shift in the reverted images. Extensive experiments demonstrate the effectiveness of the framework and of each module.
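The abstract does not specify how H-AdaIN works, so the following is only a minimal sketch of plain Adaptive Instance Normalization (Huang and Belongie, 2017) applied as a color-statistics match, which is the operation H-AdaIN's name builds on. The function name `adain_color_correct`, the choice of operating directly on RGB pixels, and the use of a reference image to supply target statistics are assumptions for illustration; the hierarchical part of the authors' module is not reproduced here.

```python
import numpy as np

def adain_color_correct(content, reference, eps=1e-5):
    """Plain AdaIN on an H x W x C float image (hypothetical helper).

    Shifts and scales each channel of `content` so that its mean and
    standard deviation match those of `reference`. This is NOT the
    paper's H-AdaIN, only the basic AdaIN statistic transfer.
    """
    # Per-channel statistics over the spatial dimensions.
    c_mean = content.mean(axis=(0, 1), keepdims=True)
    c_std = content.std(axis=(0, 1), keepdims=True)
    r_mean = reference.mean(axis=(0, 1), keepdims=True)
    r_std = reference.std(axis=(0, 1), keepdims=True)

    # Normalize the content, then re-apply the reference statistics.
    normalized = (content - c_mean) / (c_std + eps)
    return normalized * r_std + r_mean

# Usage: a reverted image with a simulated color shift is pulled back
# toward the channel statistics of a reference image.
rng = np.random.default_rng(0)
reference = rng.random((64, 64, 3))
shifted = reference * 0.5 + 0.3  # simulated color shift (assumption)
corrected = adain_color_correct(shifted, reference)
```

In this sketch the output keeps the spatial structure of `content` and only its per-channel mean and spread change, which is why AdaIN-style operations are a natural fit for correcting a global color shift.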