Sem-CS: Semantic CLIPStyler for Text-Based Image Style Transfer (2307.05934v1)
Abstract: CLIPStyler demonstrated image style transfer with realistic textures using only a style text description, with no reference style image required. However, the ground semantics of objects in its output are lost, either because the style spills over between salient and background objects (content mismatch) or because the image is over-stylized. To solve this, we propose Semantic CLIPStyler (Sem-CS), which performs semantic style transfer. Sem-CS first segments the content image into salient and non-salient objects and then transfers artistic style based on a given style text description. The semantic style transfer is achieved using a global foreground loss (for salient objects) and a global background loss (for non-salient objects). Our empirical results, including DISTS and NIMA scores and a user study, show that the proposed framework yields superior qualitative and quantitative performance. Our code is available at github.com/chandagrover/sem-cs.
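The split into a foreground term and a background term can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss formulas, so the CLIP-style directional form, the weights `w_fg` and `w_bg`, and the stand-in encoder `toy_embed` (used here in place of a real CLIP image encoder) are all assumptions made for illustration. The saliency mask is taken as given; in Sem-CS it would come from the segmentation step.

```python
import numpy as np

def cosine_distance(a, b, eps=1e-8):
    """1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def toy_embed(img):
    """Hypothetical stand-in for an image encoder: mean colour per channel."""
    return img.mean(axis=(0, 1))

def directional_loss(embed, content, output, text_dir):
    """Distance between the image-space change and a target text direction (assumed form)."""
    img_dir = embed(output) - embed(content)
    return cosine_distance(img_dir, text_dir)

def semantic_style_loss(embed, content, output, mask,
                        fg_text_dir, bg_text_dir, w_fg=1.0, w_bg=1.0):
    """Weighted sum of a foreground (salient) and a background (non-salient) term.

    mask is an H x W boolean array, True on salient pixels.
    The weights and per-region text directions are illustrative assumptions.
    """
    m = mask[..., None].astype(content.dtype)  # broadcast the mask over channels
    fg = directional_loss(embed, content * m, output * m, fg_text_dir)
    bg = directional_loss(embed, content * (1.0 - m), output * (1.0 - m), bg_text_dir)
    return w_fg * fg + w_bg * bg
```

With a real encoder, `embed` would map an image to its CLIP embedding and each text direction would be the difference between the embeddings of a style prompt and a neutral source prompt. Masking both the content and the output before encoding keeps each term from rewarding style that lands in the wrong region, which is the spill-over problem the abstract describes.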