iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer (2403.08746v1)
Abstract: Creating thematic collections in industry demands innovative designs and cohesive concepts. Designers may struggle to maintain thematic consistency when drawing inspiration from existing objects, landscapes, or artifacts. While AI-powered graphic design tools offer help, they often fail to generate cohesive sets based on a specific thematic concept. In response, we introduce iCONTRA, an interactive CONcept TRAnsfer system. With a user-friendly interface, iCONTRA enables both experienced designers and novices to effortlessly explore creative design concepts and efficiently generate thematic collections. We also propose a zero-shot image editing algorithm that eliminates the need for model fine-tuning: it gradually integrates information from the initial objects, ensuring consistency throughout the generation process without affecting the background. A pilot study suggests iCONTRA's potential to reduce designers' effort. Experimental results demonstrate its effectiveness in producing consistent and high-quality object concept transfers. iCONTRA stands as a promising tool for innovation and creative exploration in thematic collection design. The source code will be available at: https://github.com/vdkhoi20/iCONTRA.
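The abstract describes the algorithm only at a high level, so the paper's actual method is not reproduced here. As a rough illustration of what a zero-shot, fine-tuning-free concept-transfer loop with background preservation *could* look like, the sketch below blends the noise prediction from a reference (initial-object) latent into the denoising trajectory with a gradually increasing weight, and re-imposes the source background at every step via a mask. All names (`predict_noise`, `concept_transfer`, the linear blending schedule) are hypothetical stand-ins, not iCONTRA's implementation.

```python
import torch

def predict_noise(latent: torch.Tensor, t: int) -> torch.Tensor:
    """Hypothetical stand-in for a pretrained, frozen diffusion denoiser.
    A real system would call a text-conditioned UNet here; the model is
    never fine-tuned, matching the zero-shot setting in the abstract."""
    return torch.zeros_like(latent)  # placeholder output for a runnable sketch


def concept_transfer(
    source_latent: torch.Tensor,     # latent of the image being edited
    reference_latent: torch.Tensor,  # latent of the initial (concept) object
    background_mask: torch.Tensor,   # 1 where the background must be preserved
    num_steps: int = 50,
    step_size: float = 0.02,
) -> torch.Tensor:
    latent = source_latent.clone()
    for step in range(num_steps):
        t = num_steps - step
        # Gradually increase the reference contribution: weak at the start,
        # stronger toward the end, so reference information is integrated
        # progressively rather than all at once.
        w = step / max(num_steps - 1, 1)
        eps_src = predict_noise(latent, t)
        eps_ref = predict_noise(reference_latent, t)
        eps = (1.0 - w) * eps_src + w * eps_ref
        # Simplified denoising update (a real sampler, e.g. DDIM, is richer).
        latent = latent - step_size * eps
        # Copy the source background back so edits never leak outside
        # the object region.
        latent = background_mask * source_latent + (1.0 - background_mask) * latent
    return latent


# Toy usage with random tensors standing in for encoded latents.
src = torch.randn(1, 4, 64, 64)
ref = torch.randn(1, 4, 64, 64)
mask = torch.zeros(1, 1, 64, 64)
mask[..., :16, :] = 1.0  # pretend the top strip is background
edited = concept_transfer(src, ref, mask)
print(edited.shape)  # torch.Size([1, 4, 64, 64])
```

Two assumed design choices: blending noise predictions (rather than latents) keeps the trajectory close to the frozen model's learned distribution, and masked latent blending guarantees the background is preserved exactly; whether iCONTRA uses either mechanism is not stated in the abstract.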
Authors:
- Dinh-Khoi Vo
- Duy-Nam Ly
- Khanh-Duy Le
- Tam V. Nguyen
- Minh-Triet Tran
- Trung-Nghia Le