Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models (2403.08266v1)

Published 13 Mar 2024 in cs.CV and cs.GR

Abstract: While manga is a popular entertainment form, creating manga is tedious, especially adding screentones to the created sketch, namely manga screening. Unfortunately, there is no existing method tailored for automatic manga screening, probably due to the difficulty of generating high-quality shaded high-frequency screentones. Classic manga screening approaches generally require user input to provide screentone exemplars or a reference manga image. Recent deep learning models enable automatic generation by learning from a large-scale dataset. However, state-of-the-art models still fail to generate high-quality shaded screentones due to the lack of a tailored model and high-quality manga training data. In this paper, we propose a novel sketch-to-manga framework that first generates a color illustration from the sketch and then generates a screentoned manga based on the intensity guidance. Our method significantly outperforms existing methods in generating high-quality manga with shaded high-frequency screentones.

References (20)
  1. “Richness-preserving manga screening,” ACM Transactions on Graphics (TOG), vol. 27, no. 5, pp. 1–8, 2008.
  2. “Content-sensitive screening in black and white,” in International Conference on Computer Graphics Theory and Applications, 2011, vol. 2, pp. 166–172.
  3. “Mangawall: Generating manga pages for real-time applications,” in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014, pp. 679–683.
  4. “Manga filling style conversion with screentone variational autoencoder,” ACM Transactions on Graphics (TOG), vol. 39, no. 6, pp. 1–15, 2020.
  5. “Generating manga from illustrations via mimicking manga creation workflow,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 5642–5651.
  6. “Shading-guided manga screening from reference,” IEEE Transactions on Visualization and Computer Graphics, 2023.
  7. “Reference-based screentone transfer via pattern correspondence and regularization,” Computer Graphics Forum, 2023.
  8. “High-resolution image synthesis with latent diffusion models,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 10674–10685.
  9. “Adding conditional control to text-to-image diffusion models,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
  10. “Two-stage sketch colorization,” ACM Transactions on Graphics (TOG), vol. 37, no. 6, pp. 1–14, 2018.
  11. “Language-based colorization of scene sketches,” ACM Transactions on Graphics (TOG), vol. 38, no. 6, pp. 1–16, 2019.
  12. “User-guided line art flat filling with split filling mechanism,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 9889–9898.
  13. “Synthesis of screentone patterns of manga characters,” in 2019 IEEE International Symposium on Multimedia (ISM), 2019, pp. 212–2123.
  14. “Designing a better asymmetric VQGAN for StableDiffusion,” 2023.
  15. “Manga109 dataset and creation of metadata,” in Proceedings of the 1st International Workshop on CoMics ANalysis, Processing and Understanding, 2016.
  16. “Exploiting aliasing for manga restoration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 13405–13414.
  17. “Danbooru2021: A large-scale crowdsourced and tagged anime illustration dataset,” https://gwern.net/danbooru2021, 2022.
  18. “Generative adversarial networks,” Communications of the ACM, vol. 63, no. 11, pp. 139–144, 2020.
  19. “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  20. “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1125–1134.

Summary

  • The paper presents a two-step diffusion model that first converts sketches into color illustrations before applying intensity-guided screentones.
  • The framework leverages a high-quality dataset of 289,000 manga images and finetuned diffusion models to accurately capture domain-specific screentone patterns.
  • Experimental results demonstrate that the approach significantly outperforms methods like pix2pix and Screentone Synthesis in generating aesthetically consistent manga outputs.

Overview of "Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models"

The paper "Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models" addresses the intricate challenge of automating the manga screening process, specifically the transformation of sketches into manga with high-quality shaded screentones. The authors propose an innovative sketch-to-manga framework utilizing a two-step diffusion model that significantly surpasses existing methodologies in producing visually appealing manga illustrations with shaded high-frequency screentones.

Technical Contributions

The authors introduce a two-step framework that first generates a color illustration from a sketch as an intermediate representation, and subsequently produces a screentoned manga guided by intensity. Key contributions of the paper include the following (illustrative code sketches for each appear after the list):

  1. Diffusion-Based Framework: The proposed method is grounded in a diffusion-based generation model. A sketch is first transformed into a color illustration via a text-to-image diffusion model with line conditioning, which simplifies the extraction of shading information. The color illustration then guides the generation of the screentoned manga image through intensity-based conditioning.
  2. High-Quality Dataset Compilation and Model Finetuning: To address the scarcity of high-quality screentoned training data, the authors compiled a dataset of 289,000 high-resolution manga images, integrating public datasets with restored high-resolution versions. Both the VAE decoder and the U-Net of the diffusion model are finetuned specifically for manga generation, adapting them to domain-specific screentone patterns and shading.
  3. Adaptive Scaling Method: A novel adaptive scaling technique integrates high-frequency screentones into color illustrations. It dynamically adjusts screentone visibility based on the standard deviation within color clusters, keeping the result consistent with the artistic shading of the generated illustration.
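
To make the two-step design concrete, here is a minimal sketch of stage one using the Hugging Face diffusers library. The checkpoint names, prompt, and file path are public stand-ins rather than the authors' finetuned models, and stage two is only indicated schematically:

```python
# Minimal sketch of the two-stage idea with Hugging Face diffusers.
# Checkpoints, prompt, and paths are hypothetical stand-ins, not the
# authors' finetuned models.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Stage 1: sketch -> color illustration via a line-conditioned diffusion model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_lineart", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

sketch = load_image("sketch.png")  # hypothetical input path
color = pipe("a manga character, clean shading", image=sketch).images[0]

# Stage 2 (schematic): extract an intensity map from the illustration and use
# it to condition a manga-finetuned diffusion model that lays down screentones.
intensity = color.convert("L")
```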
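
For the finetuning in contribution 2, the following is a hedged sketch of the parameter selection, assuming a diffusers-style latent diffusion model and a stand-in base checkpoint: the U-Net is trained fully, while only the decoder half of the VAE is unfrozen.

```python
# Sketch of selective finetuning: train the U-Net fully, and only the decoder
# of the VAE, keeping the encoder frozen. The base checkpoint is a stand-in.
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel

vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
)
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

vae.requires_grad_(False)         # freeze the whole VAE first...
vae.decoder.requires_grad_(True)  # ...then unfreeze only its decoder

optimizer = torch.optim.AdamW(
    list(unet.parameters()) + list(vae.decoder.parameters()), lr=1e-5
)
# The training loop (noise-prediction loss over the 289,000 manga images)
# follows the standard latent diffusion recipe and is omitted here.
```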
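
The adaptive scaling of contribution 3 is described only at a high level, so the sketch below is one plausible reading rather than the paper's exact formulation: cluster the illustration's colors, then blend a screentone pattern more strongly into flat clusters than into clusters whose shading already varies.

```python
# Illustrative adaptive scaling: per-color-cluster screentone blending.
# A plausible interpretation of the paper's description, not its exact method.
import numpy as np
from sklearn.cluster import KMeans

def adaptive_screentone_blend(illustration, tone, n_clusters=8, strength=0.6):
    """illustration: (H, W, 3) uint8 color image; tone: (H, W) float in [0, 1]."""
    h, w, _ = illustration.shape
    pixels = illustration.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=n_clusters, n_init=4).fit_predict(pixels)

    gray = pixels.mean(axis=1)  # per-pixel intensity proxy
    alpha = np.empty_like(gray)
    for k in range(n_clusters):
        mask = labels == k
        # Damp tone visibility where the cluster's shading already varies.
        alpha[mask] = strength / (1.0 + gray[mask].std() / 64.0)

    alpha = alpha.reshape(h, w, 1)
    tone_rgb = np.repeat(tone[..., None], 3, axis=2) * 255.0
    out = (1.0 - alpha) * illustration + alpha * tone_rgb
    return out.astype(np.uint8)
```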

Experimental Validation and Comparative Analysis

The paper provides extensive experimental results demonstrating the superior performance of their method relative to existing alternatives in both sketch-to-manga and illustration-to-manga tasks. The framework outperforms previous methods such as pix2pix and Screentone Synthesis in generating natural and aesthetically consistent manga outputs from sketches.

Figures presented in the paper, such as Figure 2 and the visual comparisons in Figures 4 and 5, highlight the challenges associated with previous approaches and showcase the improved quality afforded by the proposed framework, especially in the context of screentone shading and preservation of original sketch details.

Implications and Future Directions

The framework proposed in this work has substantial implications for the automation of manga creation, offering a method that reduces the manual effort required for screentone application while maintaining high aesthetic standards. The research pushes the boundaries of diffusion model applications beyond general image synthesis, adapting them effectively to niche domains like manga generation.

Future developments might explore enhancements in maintaining consistency across multiple panels in manga pages and extending the framework's applicability to animations by ensuring coherence across frames in line cartoons. Additionally, further refinement of the loss functions and model architectures could enable even finer screentone detail and shading accuracy.

In summary, this paper provides a significant advancement in automated manga generation, leveraging tailored diffusion models and innovative data processing techniques. The proposed Sketch2Manga framework represents a substantial step towards a practical and efficient tool for manga artists and digital illustrators.
