- The paper introduces a novel GAN-based model leveraging SRResNet and DRAGAN for stable anime character generation.
- The research emphasizes high-quality dataset curation and tag estimation to overcome anime image diversity challenges.
- Empirical results using FID scores and attribute precision validate improved generation quality and pinpoint areas for future enhancement.
Towards the Automatic Anime Characters Creation with Generative Adversarial Networks
The paper "Towards the Automatic Anime Characters Creation with Generative Adversarial Networks" investigates the application of Generative Adversarial Networks (GANs) in the domain of anime character generation, an area with unique challenges compared to realistic image generation. The authors focus on improving the quality and stability of generated anime facial images by extensively exploring both data preparation and model optimization.
Data Preparation and Tag Estimation
A significant portion of the research deals with preparing a high-quality dataset. Unlike conventional image datasets, anime image collections exhibit high variance in style, quality, and noise, which destabilizes GAN training. To address this, the authors curated a clean, stylistically consistent dataset from Getchu, enabling effective model training. Because anime images lack intrinsic metadata, they employed the Illustration2Vec tool to estimate attribute tags for each image, providing the categorical labels needed for conditional generation.
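The tag-estimation step can be pictured as thresholding the classifier's per-tag confidence scores into a binary condition vector for the generator. A minimal sketch, in which the tag vocabulary and the 0.25 threshold are illustrative assumptions rather than the paper's exact values:

```python
# Illustrative tag vocabulary; the paper's actual label set covers hair
# colors, eye colors, and accessories estimated by Illustration2Vec.
TAG_VOCAB = ["blonde hair", "blue eyes", "smile", "twintails"]

def tags_to_condition(confidences, threshold=0.25):
    """Map a {tag: confidence} dict (as produced by a tag estimator such
    as Illustration2Vec) to a fixed-order 0/1 condition vector over
    TAG_VOCAB. Tags missing from the dict or outside the vocabulary
    contribute nothing."""
    return [1 if confidences.get(tag, 0.0) >= threshold else 0
            for tag in TAG_VOCAB]

cond = tags_to_condition({"blonde hair": 0.92, "smile": 0.31, "hat": 0.6})
print(cond)  # [1, 0, 1, 0] -- "hat" is ignored since it is not in the vocab
```

This vector is what a conditional generator would consume alongside the noise vector.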
Model Architecture and Training
The paper adopts the DRAGAN (Deep Regret Analytic GAN) training framework, which offers stable training at a lower computational cost than other gradient-penalty variants such as Wasserstein GANs. On top of this, the authors propose a generator architecture based on SRResNet, integrating both the noise vector and the condition vector into the image generation process. Inspired by ACGAN, the discriminator additionally acts as a multi-label classifier, which improves the precision of the generated attributes.
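DRAGAN stabilizes training by penalizing the discriminator's gradient norm at points perturbed around the real data. A pure-Python sketch of that penalty, using a toy analytic discriminator gradient in place of a real network (the 0.5 noise scale and weight of 10 follow DRAGAN's defaults; the rest is illustrative):

```python
import math
import random

random.seed(0)

def dragan_penalty(d_grad, xs, noise_scale=0.5, lam=10.0):
    """DRAGAN-style gradient penalty: perturb each real sample with noise
    scaled by the batch's per-coordinate spread, then penalize the squared
    deviation of the discriminator's gradient norm from 1 at those points.

    d_grad: function returning grad_x D(x) as a list of floats
            (a real implementation would use autodiff).
    xs:     batch of real samples, each a list of floats.
    """
    dim = len(xs[0])
    means = [sum(x[i] for x in xs) / len(xs) for i in range(dim)]
    stds = [math.sqrt(sum((x[i] - means[i]) ** 2 for x in xs) / len(xs))
            for i in range(dim)]
    total = 0.0
    for x in xs:
        # perturbation delta ~ noise_scale * std * U(0, 1), per coordinate
        perturbed = [x[i] + noise_scale * stds[i] * random.random()
                     for i in range(dim)]
        g = d_grad(perturbed)
        norm = math.sqrt(sum(v * v for v in g))
        total += (norm - 1.0) ** 2
    return lam * total / len(xs)

# Toy linear discriminator D(x) = w.x, whose gradient is constant: w.
w = [0.6, 0.8]  # ||w|| = 1, so the penalty should vanish
penalty = dragan_penalty(lambda x: w, [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print(round(penalty, 6))  # 0.0, since the gradient norm is already 1
```

In practice this term is added to the discriminator loss, keeping its gradients well-behaved around the data manifold.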
Results and Evaluation
The results reported in the paper show a clear improvement over previous models and attempts. Quantitative metrics, including attribute precision tables and an FID score adapted to the anime domain by using Illustration2Vec as the feature extractor, provide empirical evidence of the model's capability. These results highlight the effectiveness of carefully prepared datasets and optimized GAN architectures in generating plausible anime characters. The model performs particularly well on color-related attributes, though challenges remain with complex attributes such as "glasses" and "hat."
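FID measures the distance between Gaussian fits to feature activations of real and generated images; the paper's variant swaps Inception features for Illustration2Vec features to suit the anime domain. A one-dimensional sketch of the underlying Fréchet distance (the full metric uses multivariate means and covariances, with a matrix square root in place of the scalar one):

```python
import math

def frechet_distance_1d(mu1, var1, mu2, var2):
    """Frechet distance between two 1-D Gaussians:
    (mu1 - mu2)^2 + var1 + var2 - 2*sqrt(var1*var2).
    Lower is better; identical statistics give 0."""
    return (mu1 - mu2) ** 2 + var1 + var2 - 2.0 * math.sqrt(var1 * var2)

print(frechet_distance_1d(0.0, 1.0, 0.0, 1.0))  # 0.0 for identical stats
print(frechet_distance_1d(0.0, 1.0, 3.0, 1.0))  # 9.0, driven by the mean gap
```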
Practical Applications and Future Work
One practical contribution of the project is an online platform that lets users interact with the model to generate anime characters on demand. While this demonstrates the accessibility and real-world potential of the work, the authors acknowledge areas for improvement, specifically the imbalance in training data for certain attributes and the difficulty of generating high-resolution images effectively.
The paper opens several avenues for future research. Suggested directions include refining GAN models to better handle imbalanced data distributions and exploring super-resolution models that can raise image quality without introducing artifacts. Such efforts could further bridge the gap between amateur and professional content creation, bolstering fields like animation and game design.
In summary, this paper presents a comprehensive exploration into the domain of anime image generation with GANs. It showcases the importance of tailored datasets and appropriate model choices to address domain-specific challenges. The insights and methodologies proposed could stimulate further advancements and refinements in the generative modeling landscape for cultural and entertainment applications.