Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models (2403.08266v1)

Published 13 Mar 2024 in cs.CV and cs.GR

Abstract: While manga is a popular entertainment form, creating manga is tedious, especially adding screentones to the created sketch, namely manga screening. Unfortunately, there is no existing method tailored for automatic manga screening, probably due to the difficulty of generating high-quality shaded high-frequency screentones. Classic manga screening approaches generally require user input to provide screentone exemplars or a reference manga image. Recent deep learning models enable automatic generation by learning from a large-scale dataset. However, state-of-the-art models still fail to generate high-quality shaded screentones due to the lack of a tailored model and high-quality manga training data. In this paper, we propose a novel sketch-to-manga framework that first generates a color illustration from the sketch and then generates a screentoned manga based on the intensity guidance. Our method significantly outperforms existing methods in generating high-quality manga with shaded high-frequency screentones.

References (20)
  1. “Richness-preserving manga screening,” ACM Transactions on Graphics (TOG), vol. 27, no. 5, pp. 1–8, 2008.
  2. “Content-sensitive screening in black and white,” in International Conference on Computer Graphics Theory and Applications, 2011, vol. 2, pp. 166–172.
  3. “Mangawall: Generating manga pages for real-time applications,” in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014, pp. 679–683.
  4. “Manga filling style conversion with screentone variational autoencoder,” ACM Transactions on Graphics (TOG), vol. 39, no. 6, pp. 1–15, 2020.
  5. “Generating manga from illustrations via mimicking manga creation workflow,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 5642–5651.
  6. “Shading-guided manga screening from reference,” IEEE Transactions on Visualization and Computer Graphics, 2023.
  7. “Reference-based screentone transfer via pattern correspondence and regularization,” Computer Graphics Forum, 2023.
  8. “High-resolution image synthesis with latent diffusion models,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 10674–10685.
  9. “Adding conditional control to text-to-image diffusion models,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
  10. “Two-stage sketch colorization,” ACM Transactions on Graphics (TOG), vol. 37, no. 6, pp. 1–14, 2018.
  11. “Language-based colorization of scene sketches,” ACM Transactions on Graphics (TOG), vol. 38, no. 6, pp. 1–16, 2019.
  12. “User-guided line art flat filling with split filling mechanism,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 9889–9898.
  13. “Synthesis of screentone patterns of manga characters,” in 2019 IEEE International Symposium on Multimedia (ISM), 2019, pp. 212–2123.
  14. “Designing a better asymmetric VQGAN for StableDiffusion,” 2023.
  15. “Manga109 dataset and creation of metadata,” in Proceedings of the 1st International Workshop on CoMics ANalysis, Processing and Understanding, 2016.
  16. “Exploiting aliasing for manga restoration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 13405–13414.
  17. “Danbooru2021: A large-scale crowdsourced and tagged anime illustration dataset,” https://gwern.net/danbooru2021, 2022.
  18. “Generative adversarial networks,” Communications of the ACM, vol. 63, no. 11, pp. 139–144, 2020.
  19. “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  20. “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1125–1134.

Summary

  • The paper presents a two-step diffusion model that first converts sketches into color illustrations before applying intensity-guided screentones.
  • The framework leverages a high-quality dataset of 289,000 manga images and finetuned diffusion models to accurately capture domain-specific screentone patterns.
  • Experimental results demonstrate that the approach significantly outperforms methods like pix2pix and Screentone Synthesis in generating aesthetically consistent manga outputs.

Overview of "Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models"

The paper "Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models" addresses the intricate challenge of automating the manga screening process, specifically the transformation of sketches into manga with high-quality shaded screentones. The authors propose an innovative sketch-to-manga framework utilizing a two-step diffusion model that significantly surpasses existing methodologies in producing visually appealing manga illustrations with shaded high-frequency screentones.

Technical Contributions

The authors introduce a two-step framework that first generates a color illustration from a sketch as an intermediate representation, and subsequently produces a screentoned manga guided by intensity. Key contributions of the paper include the following (illustrative code sketches for each appear after the list):

  1. Diffusion-Based Framework: The proposed method is grounded in a diffusion-based generation model. A sketch is first transformed into a color illustration via a text-to-image diffusion model with line conditioning, which simplifies the extraction of shading information. The color illustration then guides the generation of the screentoned manga image through intensity-based conditioning.
  2. High-Quality Dataset Compilation and Model Finetuning: To address the scarcity of high-quality screentoned training data, the authors compiled a dataset of 289,000 high-resolution manga images, integrating public datasets with restored high-resolution versions. Both the VAE decoder and the U-Net of the diffusion model are finetuned specifically for manga generation, adapting them to domain-specific screentone patterns and shading.
  3. Adaptive Scaling Method: A novel adaptive scaling technique integrates high-frequency screentones into color illustrations. It dynamically adjusts screentone visibility based on the standard deviation within color clusters, keeping the result consistent with the artistic shading of the generated illustration.
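
To make the two-step design concrete, here is a minimal sketch of stage one using the Hugging Face diffusers library. The checkpoint names, prompt, and file path are public stand-ins rather than the authors' finetuned models, and stage two is only indicated schematically:

```python
# Minimal sketch of the two-stage idea with Hugging Face diffusers.
# Checkpoints, prompt, and paths are hypothetical stand-ins, not the
# authors' finetuned models.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Stage 1: sketch -> color illustration via a line-conditioned diffusion model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_lineart", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

sketch = load_image("sketch.png")  # hypothetical input path
color = pipe("a manga character, clean shading", image=sketch).images[0]

# Stage 2 (schematic): extract an intensity map from the illustration and use
# it to condition a manga-finetuned diffusion model that lays down screentones.
intensity = color.convert("L")
```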
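
For the finetuning in contribution 2, the following is a hedged sketch of the parameter selection, assuming a diffusers-style latent diffusion model and a stand-in base checkpoint: the U-Net is trained fully, while only the decoder half of the VAE is unfrozen.

```python
# Sketch of selective finetuning: train the U-Net fully, and only the decoder
# of the VAE, keeping the encoder frozen. The base checkpoint is a stand-in.
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel

vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
)
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

vae.requires_grad_(False)         # freeze the whole VAE first...
vae.decoder.requires_grad_(True)  # ...then unfreeze only its decoder

optimizer = torch.optim.AdamW(
    list(unet.parameters()) + list(vae.decoder.parameters()), lr=1e-5
)
# The training loop (noise-prediction loss over the 289,000 manga images)
# follows the standard latent diffusion recipe and is omitted here.
```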
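
The adaptive scaling of contribution 3 is described only at a high level, so the sketch below is one plausible reading rather than the paper's exact formulation: cluster the illustration's colors, then blend a screentone pattern more strongly into flat clusters than into clusters whose shading already varies.

```python
# Illustrative adaptive scaling: per-color-cluster screentone blending.
# A plausible interpretation of the paper's description, not its exact method.
import numpy as np
from sklearn.cluster import KMeans

def adaptive_screentone_blend(illustration, tone, n_clusters=8, strength=0.6):
    """illustration: (H, W, 3) uint8 color image; tone: (H, W) float in [0, 1]."""
    h, w, _ = illustration.shape
    pixels = illustration.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=n_clusters, n_init=4).fit_predict(pixels)

    gray = pixels.mean(axis=1)  # per-pixel intensity proxy
    alpha = np.empty_like(gray)
    for k in range(n_clusters):
        mask = labels == k
        # Damp tone visibility where the cluster's shading already varies.
        alpha[mask] = strength / (1.0 + gray[mask].std() / 64.0)

    alpha = alpha.reshape(h, w, 1)
    tone_rgb = np.repeat(tone[..., None], 3, axis=2) * 255.0
    out = (1.0 - alpha) * illustration + alpha * tone_rgb
    return out.astype(np.uint8)
```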

Experimental Validation and Comparative Analysis

The paper provides extensive experimental results demonstrating the superior performance of their method relative to existing alternatives in both sketch-to-manga and illustration-to-manga tasks. The framework outperforms previous methods such as pix2pix and Screentone Synthesis in generating natural and aesthetically consistent manga outputs from sketches.

Figures presented in the paper, such as Figure 2 and the visual comparisons in Figures 4 and 5, highlight the challenges associated with previous approaches and showcase the improved quality afforded by the proposed framework, especially in the context of screentone shading and preservation of original sketch details.

Implications and Future Directions

The framework proposed in this work has substantial implications for the automation of manga creation, offering a method that reduces the manual effort required for screentone application while maintaining high aesthetic standards. The research pushes the boundaries of diffusion model applications beyond general image synthesis, adapting them effectively to niche domains like manga generation.

Future developments might explore enhancements in maintaining consistency across multiple panels in manga pages and extending the framework's applicability to animations by ensuring coherence across frames in line cartoons. Additionally, further refinement of the loss functions and model architectures could enable even finer screentone detail and shading accuracy.

In summary, this paper provides a significant advancement in automated manga generation, leveraging tailored diffusion models and innovative data processing techniques. The proposed Sketch2Manga framework represents a substantial step towards a practical and efficient tool for manga artists and digital illustrators.
