StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN (2403.14186v1)
Abstract: We propose a method that can generate cinemagraphs automatically from a still landscape image using a pre-trained StyleGAN. Inspired by the success of recent unconditional video generation, we leverage a powerful pre-trained image generator to synthesize high-quality cinemagraphs. Unlike previous approaches that mainly utilize the latent space of a pre-trained StyleGAN, our approach utilizes its deep feature space for both GAN inversion and cinemagraph generation. Specifically, we propose multi-scale deep feature warping (MSDFW), which warps the intermediate features of a pre-trained StyleGAN at different resolutions. By using MSDFW, the generated cinemagraphs are of high resolution and exhibit plausible looping animation. We demonstrate the superiority of our method through user studies and quantitative comparisons with state-of-the-art cinemagraph generation methods and a video generation method that uses a pre-trained StyleGAN.
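The core idea of warping intermediate generator features at several resolutions can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single dense motion field `flow` (channel 0 is the x displacement, channel 1 the y displacement, in pixels of its own resolution) and a list of feature maps shaped `(C, H, W)`. For simplicity it uses bilinear backward warping, which is an assumption made here and not something the abstract specifies. The names `resize_flow`, `backward_warp`, and `warp_multiscale` are hypothetical, and looping is not handled.

```python
import numpy as np

def resize_flow(flow, h, w):
    """Nearest-neighbour resize of a (2, H0, W0) flow to (2, h, w).

    Displacements are rescaled so they stay in pixel units of the target size.
    """
    _, h0, w0 = flow.shape
    ys = np.arange(h) * h0 // h
    xs = np.arange(w) * w0 // w
    resized = flow[:, ys][:, :, xs]
    scale = np.array([w / w0, h / h0]).reshape(2, 1, 1)
    return resized * scale

def backward_warp(feat, flow):
    """Bilinear backward warp: out[:, y, x] samples feat at (y + dy, x + dx)."""
    _, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp sampling positions to the feature map borders.
    src_x = np.clip(xs + flow[0], 0, w - 1)
    src_y = np.clip(ys + flow[1], 0, h - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = src_x - x0
    wy = src_y - y0
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def warp_multiscale(features, flow):
    """Warp every feature map with the same motion field, resized per scale."""
    return [
        backward_warp(f, resize_flow(flow, f.shape[-2], f.shape[-1]))
        for f in features
    ]

# Toy usage: two random "feature maps" standing in for generator activations.
rng = np.random.default_rng(0)
features = [rng.normal(size=(8, 16, 16)), rng.normal(size=(4, 32, 32))]
flow = np.zeros((2, 32, 32))
flow[0] = 2.0  # uniform motion of 2 pixels along x at 32x32
warped = warp_multiscale(features, flow)
print([w.shape for w in warped])
```

The point of the sketch is that one motion field drives every scale: it is resized and its magnitudes rescaled for each feature resolution, so coarse and fine features move consistently.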