Towards Multi-domain Face Landmark Detection with Synthetic Data from Diffusion model (2401.13191v1)
Abstract: Recently, deep learning-based facial landmark detection for in-the-wild faces has improved significantly. However, face landmark detection in other domains (e.g., cartoons and caricatures) remains challenging due to the scarcity of extensively annotated training data. To address this, we design a two-stage training approach that effectively leverages limited datasets and a pre-trained diffusion model to obtain aligned pairs of landmarks and faces across multiple domains. In the first stage, we train a landmark-conditioned face generation model on a large dataset of real faces. In the second stage, we fine-tune this model on a small dataset of image-landmark pairs with text prompts that control the domain. These designs enable our method to generate high-quality synthetic paired datasets in multiple domains while preserving the alignment between landmarks and facial features. Finally, we fine-tune a pre-trained face landmark detection model on the synthetic dataset to achieve multi-domain face landmark detection. Our qualitative and quantitative results demonstrate that our method outperforms existing methods on multi-domain face landmark detection.
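As a rough illustration of the second-stage data synthesis described above, the sketch below uses the Hugging Face diffusers library to sample one stylized face from a landmark-conditioned ControlNet with a domain text prompt. The checkpoint path "your-landmark-controlnet", the placeholder landmark coordinates, and the prompt wording are assumptions for illustration only, not the authors' released code or settings.

```python
# Minimal sketch (assumed setup, not the authors' implementation): generate one
# synthetic landmark/image pair with a landmark-conditioned ControlNet and a
# domain-controlling text prompt.
import torch
from PIL import Image, ImageDraw
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

def render_landmarks(points, size=512, radius=3):
    """Render (x, y) landmark coordinates as white dots on a black condition image."""
    canvas = Image.new("RGB", (size, size), "black")
    draw = ImageDraw.Draw(canvas)
    for x, y in points:
        draw.ellipse([x - radius, y - radius, x + radius, y + radius], fill="white")
    return canvas

# "your-landmark-controlnet" is a hypothetical checkpoint fine-tuned on
# image-landmark pairs (stage 2); swap in a real path or model id.
controlnet = ControlNetModel.from_pretrained(
    "your-landmark-controlnet", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder 68-point landmark set; in practice these come from an annotated face.
landmark_points = [(200 + (7 * i) % 120, 180 + (5 * i) % 160) for i in range(68)]
condition = render_landmarks(landmark_points)

# The text prompt selects the target domain (e.g. "cartoon", "caricature").
result = pipe(
    prompt="a cartoon style portrait of a person",
    image=condition,
    num_inference_steps=30,
).images[0]
result.save("synthetic_cartoon_face.png")
```

Each generated image can then be stored alongside the landmark coordinates that conditioned it, yielding the aligned image-landmark pairs on which the face landmark detector is fine-tuned.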
Authors: Yuanming Li, Gwantae Kim, Jeong-gi Kwak, Bon-hwa Ku, Hanseok Ko