Fast Personalized Text-to-Image Syntheses With Attention Injection

Published 17 Mar 2024 in cs.CV (arXiv:2403.11284v1)

Abstract: Current personalized image generation methods mostly require considerable time to fine-tune and often overfit the concept, producing images that resemble the custom concept but are difficult to edit with prompts. We propose a fast and effective approach that balances the text-image consistency and the identity consistency between the generated image and the reference image. Our method generates personalized images without any fine-tuning while preserving the inherent text-to-image generation ability of diffusion models. Given a prompt and a reference image, we merge the custom concept into the generated image by manipulating the cross-attention and self-attention layers of the original diffusion model, producing personalized images that match the text description. Comprehensive experiments highlight the superiority of our method.
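The core mechanism the abstract describes is injecting reference-image features into the diffusion model's attention layers. A common way to realize this kind of injection is to extend a self-attention layer's keys and values with features extracted from the reference image, so the denoised image can attend to the custom concept's appearance. The sketch below illustrates that idea with random tensors standing in for U-Net activations; the function name, shapes, and the simple concatenation scheme are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def self_attention_with_injection(q, k, v, k_ref, v_ref):
    """Illustrative sketch of reference-feature injection (assumed scheme,
    not the paper's exact method): the reference image's keys/values are
    concatenated to the generated image's own, so each query can attend
    to reference features as well.
    Shapes: q (B, Nq, D); k, v (B, N, D); k_ref, v_ref (B, Nr, D).
    """
    scale = q.shape[-1] ** -0.5
    # Append reference keys/values after the generated image's own tokens.
    k_all = torch.cat([k, k_ref], dim=1)   # (B, N + Nr, D)
    v_all = torch.cat([v, v_ref], dim=1)   # (B, N + Nr, D)
    # Standard scaled dot-product attention over the extended sequence.
    attn = torch.softmax(q @ k_all.transpose(-2, -1) * scale, dim=-1)
    return attn @ v_all                     # (B, Nq, D)

# Toy usage: random features in place of real U-Net activations.
B, Nq, Nr, D = 1, 16, 16, 8
q, k, v = (torch.randn(B, Nq, D) for _ in range(3))
k_ref, v_ref = (torch.randn(B, Nr, D) for _ in range(2))
out = self_attention_with_injection(q, k, v, k_ref, v_ref)
print(out.shape)  # torch.Size([1, 16, 8])
```

Because only the key/value sequence is extended, the output keeps the spatial layout of the generated image while mixing in reference appearance, which is why this family of techniques can personalize without fine-tuning any weights.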
