
Personalized Restoration via Dual-Pivot Tuning (2312.17234v1)

Published 28 Dec 2023 in cs.CV

Abstract: Generative diffusion models can serve as a prior which ensures that solutions of image restoration systems adhere to the manifold of natural images. However, for restoring facial images, a personalized prior is necessary to accurately represent and reconstruct the unique facial features of a given individual. In this paper, we propose a simple, yet effective, method for personalized restoration, called Dual-Pivot Tuning, a two-stage approach that personalizes a blind restoration system while maintaining the integrity of the general prior and the distinct role of each component. Our key observation is that for optimal personalization, the generative model should be tuned around a fixed text pivot, while the guiding network should be tuned in a generic (non-personalized) manner, using the personalized generative model as a fixed "pivot". This approach ensures that personalization does not interfere with the restoration process, resulting in a natural appearance with high fidelity to the person's identity and the attributes of the degraded image. We evaluated our approach both qualitatively and quantitatively through extensive experiments with images of widely recognized individuals, comparing it against relevant baselines. Surprisingly, we found that our personalized prior not only achieves higher fidelity to the person's identity, but also outperforms state-of-the-art generic priors in terms of general image quality. Project webpage: https://personalized-restoration.github.io
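
The two stages described in the abstract can be read as a schedule of which component is tuned and which is held fixed. The following is a minimal sketch of that reading only, not the authors' implementation: the class names, component names, and stage labels are all hypothetical, and keeping the guiding network frozen during the first stage is an assumption the abstract does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A named part of the restoration system with a trainable flag."""
    name: str
    trainable: bool = False


@dataclass
class RestorationSystem:
    # Hypothetical component names chosen for illustration only.
    text_pivot: Component = field(default_factory=lambda: Component("text_pivot"))
    generative_model: Component = field(default_factory=lambda: Component("generative_model"))
    guiding_network: Component = field(default_factory=lambda: Component("guiding_network"))

    def components(self):
        return [self.text_pivot, self.generative_model, self.guiding_network]

    def set_trainable(self, *names):
        # Mark exactly the named components as trainable; freeze the rest.
        for component in self.components():
            component.trainable = component.name in names

    def trainable_names(self):
        return [c.name for c in self.components() if c.trainable]


def dual_pivot_schedule(system):
    """Return (stage label, trainable components) for the two stages."""
    schedule = []

    # Stage 1: the text pivot stays fixed and the generative model is tuned
    # around it. Freezing the guiding network here is an assumption.
    system.set_trainable("generative_model")
    schedule.append(("stage_1_personalize_prior", system.trainable_names()))

    # Stage 2: the personalized generative model is now the fixed pivot and
    # only the guiding network is tuned, in a generic manner.
    system.set_trainable("guiding_network")
    schedule.append(("stage_2_tune_guidance", system.trainable_names()))

    return schedule


if __name__ == "__main__":
    for stage, trainable in dual_pivot_schedule(RestorationSystem()):
        print(stage, trainable)
```

In a real training setup, the `trainable` flag would correspond to enabling or disabling gradient updates for the matching network; the sketch tracks only which component changes in each stage.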
