APISR: Anime Production Inspired Real-World Anime Super-Resolution (2403.01598v2)
Abstract: While real-world anime super-resolution (SR) has gained increasing attention in the SR community, existing methods still adopt techniques from the photorealistic domain. In this paper, we analyze the anime production workflow and rethink how its characteristics can be used for real-world anime SR. First, we argue that video networks and datasets are not necessary for anime SR because hand-drawn frames are reused repeatedly. Instead, we propose an anime image collection pipeline that chooses the least compressed and most informative frames from the video sources. Based on this pipeline, we introduce the Anime Production-oriented Image (API) dataset. In addition, we identify two anime-specific challenges: distorted and faint hand-drawn lines, and unwanted color artifacts. We address the first issue by introducing a prediction-oriented compression module in the image degradation model and a pseudo-ground-truth preparation step with enhanced hand-drawn lines. For the second issue, we introduce the balanced twin perceptual loss, which combines anime and photorealistic high-level features to mitigate unwanted color artifacts and increase visual clarity. We evaluate our method through extensive experiments on the public benchmark, showing that our method outperforms state-of-the-art approaches trained on anime datasets.
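The frame-collection idea in the abstract (keep the least compressed frames, then prefer the most informative ones) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Frame` fields, the use of encoded size as a proxy for compression level, the generic `complexity` score, and the fixed-size windowing are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    index: int          # position of the frame in the source video
    encoded_bytes: int  # hypothetical proxy: a larger encoded size means less compression
    complexity: float   # hypothetical informativeness score (higher = more detail)


def select_frames(frames: List[Frame], window: int, top_k: int) -> List[Frame]:
    """Two-stage selection sketch.

    Stage 1: in each consecutive window of frames, keep the single frame
             with the largest encoded size (assumed least compressed).
    Stage 2: rank the survivors by complexity and return the top_k.
    """
    if window <= 0 or top_k <= 0:
        raise ValueError("window and top_k must be positive")

    candidates = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        # max() returns the first maximal element, so ties keep the earliest frame.
        candidates.append(max(chunk, key=lambda f: f.encoded_bytes))

    ranked = sorted(candidates, key=lambda f: f.complexity, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    demo = [
        Frame(0, 100, 0.2), Frame(1, 300, 0.5), Frame(2, 200, 0.9),
        Frame(3, 50, 0.1), Frame(4, 400, 0.3), Frame(5, 350, 0.8),
    ]
    for frame in select_frames(demo, window=3, top_k=2):
        print(frame.index, frame.encoded_bytes, frame.complexity)
```

Because frames are filtered by compression first, a highly detailed frame can still be dropped when a neighbor in the same window is less compressed. Whether that ordering matches the paper's pipeline is not stated in the abstract.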