
AnimeHair: 3D Anime Hairstyle Dataset

Updated 4 July 2026
  • AnimeHair is a large-scale dataset of 37K curated 3D anime hairstyles with separated hair cards that enable focused learning in a stylized domain.
  • The dataset is paired with a control-point-based parameterization and an autoregressive transformer that treats hair as a sequential 'hair language,' giving a compact and invertible representation.
  • Empirical evaluations demonstrate state-of-the-art reconstruction and perceptual metrics, highlighting its effectiveness for both direct mesh editing and conditional generative modeling.

AnimeHair is a large-scale dataset of 37K high-quality anime hairstyles with separated hair cards and processed mesh data, introduced to facilitate both training and evaluation of anime hairstyle generation within the CHARM framework (He et al., 25 Sep 2025). It targets a domain in which anime hairstyle exhibits highly stylized, piecewise-structured geometry that challenges existing techniques, and it is paired with a compact, invertible control-point-based parameterization and an autoregressive generative framework that treats a hairstyle as a sequential “hair language.” In practice, AnimeHair functions both as a curated corpus of decomposed 3D hair geometry and as the empirical basis for learning-based reconstruction and conditional generation.

1. Research setting and motivation

Traditional hair modeling methods focus on realistic hair using strand-based or volumetric representations, whereas anime hairstyle exhibits highly stylized, piecewise-structured geometry. Existing works often rely on dense mesh modeling or hand-crafted spline curves, making them inefficient for editing and unsuitable for scalable learning (He et al., 25 Sep 2025).

Within that setting, AnimeHair addresses two constraints simultaneously. First, it provides a large-scale source of 3D anime-style hairstyles in a form compatible with supervised learning. Second, it aligns dataset design with a representation in which a sequence of control points represents each hair card, rather than treating hair as an undifferentiated dense mesh. This suggests a deliberate coupling between data curation and model architecture: the dataset is not merely a repository of meshes, but a substrate for sequential geometric modeling.

A common misconception is that anime hair can be handled as a minor stylistic variant of realistic hair. The CHARM formulation rejects that assumption by centering hair cards, control-point sequences, and card ordering around the head, rather than realistic strand simulation or volumetric occupancy alone. A plausible implication is that AnimeHair is best understood as a domain-specific dataset whose structure is inseparable from the stylization conventions of anime character design.

2. Dataset construction and statistical profile

AnimeHair consists of 37 000 distinct 3D anime-style hairstyles downloaded from the public VRoid-Hub repository (He et al., 25 Sep 2025). Each original character mesh was normalized to fit in a $[-0.5, 0.5]^3$ cube, after which the hair submesh was extracted by material tags. The resulting corpus is described as a dataset of fully decomposed “hair cards” harvested and cleaned from VRoid-Hub.
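The normalization step can be sketched as follows. This is a minimal illustration, not the paper's pipeline: centering on the bounding-box midpoint and scaling uniformly by the longest axis are assumptions, and `normalize_to_unit_cube` is a hypothetical helper name.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def normalize_to_unit_cube(vertices: List[Vec3]) -> List[Vec3]:
    """Center vertices and scale them uniformly into [-0.5, 0.5]^3.

    Assumption: the bounding-box center is the origin of the normalized
    frame and the longest bounding-box axis is scaled to length 1.
    """
    if not vertices:
        return []
    mins = [min(v[a] for v in vertices) for a in range(3)]
    maxs = [max(v[a] for v in vertices) for a in range(3)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    extent = max(hi - lo for lo, hi in zip(mins, maxs))
    scale = 1.0 / extent if extent > 0 else 1.0
    return [
        tuple((v[a] - center[a]) * scale for a in range(3))
        for v in vertices
    ]
```

Uniform scaling preserves the proportions of the character, so the width and thickness thresholds used later are comparable across models.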

Preprocessing proceeds through several mesh-cleaning and structural validation stages. Vertex merging and connected-component analysis ensure each hair card is watertight. Endpoints are identified by locating regions where valence patterns change and then tracing each card back to the root. Mesh components that fail to match the repeating-unit template (diamond or triangular pyramids) at $\ge 98\%$ recall are discarded. Outliers are also filtered out, including any hair with width or thickness $> 0.10$ or with more than $6\,000$ control points.
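The outlier rule above can be sketched as a simple predicate. The `Card` container and the function name are hypothetical, and treating the 6 000-point limit as a per-hairstyle total is an assumption based on the per-hairstyle statistics reported for the dataset.

```python
from dataclasses import dataclass
from typing import List

MAX_WIDTH_OR_THICKNESS = 0.10  # threshold quoted in the text
MAX_CONTROL_POINTS = 6000      # threshold quoted in the text


@dataclass
class Card:
    """Hypothetical container: per-control-point widths and thicknesses."""
    widths: List[float]
    thicknesses: List[float]


def is_outlier(cards: List[Card]) -> bool:
    """Return True if a hairstyle should be filtered out.

    Assumption: the control-point limit applies to the whole hairstyle.
    """
    total_points = sum(len(card.widths) for card in cards)
    if total_points > MAX_CONTROL_POINTS:
        return True
    for card in cards:
        if any(w > MAX_WIDTH_OR_THICKNESS for w in card.widths):
            return True
        if any(t > MAX_WIDTH_OR_THICKNESS for t in card.thicknesses):
            return True
    return False
```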

The dataset statistics are explicitly tuned for autoregressive sequence modeling. Each hairstyle contains 25–130 cards, each card contains 20–60 control points, and the total per hairstyle is 1 000–6 000 control points, a range stated to be suitable for autoregressive sequences. A random 100-sample held-out test set is reserved, and the remaining 36 900 hairstyles train the transformer.
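The split arithmetic (37 000 hairstyles, 100 held out, 36 900 for training) can be reproduced with a seeded shuffle. The seed value and the function name are assumptions made for illustration; the source only states that the test set is random.

```python
import random
from typing import List, Sequence, Tuple


def split_held_out(
    ids: Sequence[int], n_test: int = 100, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle ids deterministically and reserve the first n_test for testing."""
    rng = random.Random(seed)  # assumed seed, not from the source
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


train_ids, test_ids = split_held_out(range(37000))
```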

The reported distributions further characterize the corpus. Specifically, 51.4 % of cards are “short” ($< 0.25$ normalized length), 33 % are medium, and 15.6 % are long. The $x, z$ positions cluster near the head center, $y$ positions concentrate higher up, and width and thickness follow smooth but heavy-tailed distributions. These statistics indicate that AnimeHair is not only large-scale, but also distributionally structured around common anime hairstyle conventions such as concentrated scalp attachment regions and variable card extent.

3. Control-point parameterization and invertibility

The geometric core associated with AnimeHair is a control-point-based parameterization in which each hair card is converted into $N$ control points (He et al., 25 Sep 2025). At control point $i$, the representation stores five floats:

$$p_i = (x_i, y_i, z_i), \qquad w_i, \qquad t_i,$$

where $p_i$ is the 3D position, $w_i$ is the half-width, and $t_i$ is the thickness.
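As a data structure, the five floats per control point might be laid out as below. The class and function names are hypothetical; only the field meanings (position, half-width, thickness) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ControlPoint:
    x: float
    y: float
    z: float
    w: float  # half-width of the card at this point
    t: float  # thickness of the card at this point

    def as_floats(self) -> Tuple[float, float, float, float, float]:
        return (self.x, self.y, self.z, self.w, self.t)


def flatten_card(card: List[ControlPoint]) -> List[float]:
    """Flatten a card of N control points into 5 * N floats, root first."""
    flat: List[float] = []
    for point in card:
        flat.extend(point.as_floats())
    return flat
```

A card with $N$ control points therefore occupies $5N$ floats regardless of how many vertices the original mesh used.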

Tangent estimation is performed by a fourth-order finite difference on five neighbors; the specific weights and the tangent formula are not reproduced here.
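For orientation, a fourth-order, five-point tangent estimate can be sketched with the standard central stencil $(1, -8, 0, 8, -1)/12$. That choice of weights is an assumption (the source's own weights are not legible), as is the fallback to a lower-order difference near the two ends of the card; `estimate_tangents` is a hypothetical name.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def _normalize(v: Vec3) -> Vec3:
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(c / norm for c in v)


def estimate_tangents(points: List[Vec3]) -> List[Vec3]:
    """Unit tangents along a root-to-tip polyline of control points."""
    n = len(points)
    tangents: List[Vec3] = []
    for i in range(n):
        if 2 <= i <= n - 3:
            # Assumed standard five-point central stencil (fourth order).
            d = tuple(
                (points[i - 2][a] - 8.0 * points[i - 1][a]
                 + 8.0 * points[i + 1][a] - points[i + 2][a]) / 12.0
                for a in range(3)
            )
        else:
            # Assumed boundary handling: difference of available neighbors.
            lo, hi = max(i - 1, 0), min(i + 1, n - 1)
            d = tuple(points[hi][a] - points[lo][a] for a in range(3))
        tangents.append(_normalize(d))
    return tangents
```

On a straight card the stencil returns the card direction exactly, which is a quick sanity check for any implementation.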

A smooth normal field is then computed by least squares with an inter-point smoothness term, together with a PCA-based constraint that avoids the trivial zero solution. Width and thickness directions are then defined at each control point.

Mesh reconstruction follows directly from these quantities. The two diamond bases of each unit are rebuilt from the control-point parameters, and consecutive units are linked by shared vertices with quad faces to recover the original mesh. Inverse encoding reads each unit’s face centroids for the position $p_i$ and fits base and height for the width $w_i$ and thickness $t_i$. The paper states that this five-parameter model substantially compresses the original mesh; the exact ratio and the formulas for the normal field, the frame directions, and the diamond bases are not reproduced here.

This representation is characterized as compact and invertible. In practical terms, that means the dataset is not limited to passive storage: it supports direct geometric editing at the level of curve shape, cross-section width, and thickness while remaining compatible with sequence-based learning. A plausible implication is that AnimeHair is simultaneously a dataset format and an implicit interface for hairstyle manipulation.

4. Sequential formulation and generative modeling

Within CHARM, AnimeHair is consumed by an autoregressive transformer that interprets anime hairstyles as a sequential “hair language” (He et al., 25 Sep 2025). Sequence construction begins by sorting cards counterclockwise around the head, looking down along the vertical axis, in order to capture inter-card structure. Within each card, control points follow root-to-tip connectivity. The final token stream concatenates the cards in this order, with one special token starting each card, one ending each card, and one ending the hairstyle; the token symbols themselves are not reproduced here.
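The ordering and serialization can be sketched as below. Several details are assumptions: the frame is taken to be right-handed with $y$ up, the angle is measured at each card's root, and the token names `<card_start>`, `<card_end>`, and `<hair_end>` are hypothetical stand-ins for the paper's special tokens.

```python
import math
from typing import List, Tuple, Union

ControlPoint = Tuple[float, float, float, float, float]  # (x, y, z, w, t)
Card = List[ControlPoint]                                # root first, tip last
Token = Union[str, ControlPoint]

# Hypothetical token names; the source's symbols are not legible.
CARD_START, CARD_END, HAIR_END = "<card_start>", "<card_end>", "<hair_end>"


def root_azimuth(card: Card) -> float:
    """Angle of the card root around the vertical (y) axis, in [0, 2*pi).

    Assumption: right-handed, y-up frame. Viewed from above with +x to the
    right, -z points up on screen, so atan2(-z, x) grows counterclockwise.
    """
    x, _, z = card[0][:3]
    return math.atan2(-z, x) % (2.0 * math.pi)


def serialize(cards: List[Card]) -> List[Token]:
    """Sort cards counterclockwise and emit one delimited token stream."""
    stream: List[Token] = []
    for card in sorted(cards, key=root_azimuth):
        stream.append(CARD_START)
        stream.extend(card)  # control points stay in root-to-tip order
        stream.append(CARD_END)
    stream.append(HAIR_END)
    return stream
```

Sorting by angle places spatially adjacent cards next to each other in the sequence, which is the inter-card structure the ordering is meant to expose.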

Conditioning is provided by Michelangelo [Zhao et al. ’23], which converts the input 10 000-point surface cloud into a fixed token sequence. A control-point encoder embeds the discrete control-point tokens via learnable lookup tables and linearly projects them into hidden states.

A decoder-only transformer with 6 layers and hidden dimension 768 performs next-token prediction conditioned on the condition tokens and past hidden states. Cascaded decoders then predict attributes in the order position, width, thickness.

Training minimizes the sum of a cross-entropy over discrete token predictions plus two binary cross-entropies, one for each of two binary classifiers. The corresponding equations are not reproduced here.

At inference time, specialized heuristics improve robustness. Root Position Verification tests top-$k$ alternatives if a newly predicted root lies too far from the scalp cloud (the distance threshold is not reproduced here). Length Normalization caps control points at 80 per card via cubic-spline resampling and forcibly emits an end token at 100. These design choices indicate that AnimeHair is organized not just for storage, but for stable variable-length decoding.
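The length cap can be illustrated with a resampling sketch. To stay dependency-free it swaps in piecewise-linear interpolation along arc length in place of the cubic-spline resampling named in the text, so it shows the shape of the step rather than the paper's method; `resample_card` is a hypothetical name.

```python
import math
from typing import List, Tuple

ControlPoint = Tuple[float, float, float, float, float]  # (x, y, z, w, t)

MAX_POINTS_PER_CARD = 80  # cap quoted in the text


def resample_card(
    card: List[ControlPoint], max_points: int = MAX_POINTS_PER_CARD
) -> List[ControlPoint]:
    """Reduce an over-long card to max_points samples, evenly spaced in arc length.

    Simplification: linear interpolation stands in for cubic splines.
    """
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    if len(card) <= max_points:
        return list(card)
    # Cumulative arc length measured on the 3D positions only.
    cum = [0.0]
    for a, b in zip(card, card[1:]):
        cum.append(cum[-1] + math.dist(a[:3], b[:3]))
    total = cum[-1]
    out: List[ControlPoint] = []
    j = 0
    for k in range(max_points):
        s = total * k / (max_points - 1)
        while j < len(card) - 2 and cum[j + 1] < s:
            j += 1
        seg = cum[j + 1] - cum[j]
        u = (s - cum[j]) / seg if seg > 0 else 0.0
        out.append(tuple(
            card[j][d] + u * (card[j + 1][d] - card[j][d]) for d in range(5)
        ))
    return out
```

Width and thickness are interpolated together with position, so the resampled card keeps its cross-section profile.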

5. Empirical evaluation and ablation structure

Evaluation is conducted on the 100 held-out hairstyles, with baselines MeshAnything, MeshAnything V2, BPT, and DeepMesh (He et al., 25 Sep 2025). The reported geometry metrics are Chamfer Distance, Earth Mover’s Distance, Hausdorff Distance, and Voxel IoU on a voxel grid whose resolution is not reproduced here. Perceptual quality is assessed by average CLIP cosine similarity over eight rendered views.
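Three of the geometry metrics can be sketched on small point clouds as below. These are common textbook definitions, not necessarily the exact variants used in the paper: Chamfer Distance is taken as the sum of the two mean nearest-neighbor distances (unsquared), the voxel resolution of 32 is a placeholder, and points are assumed to lie in $[-0.5, 0.5]^3$. Earth Mover's Distance is omitted because it requires an optimal matching solver.

```python
import math
from typing import List, Set, Tuple

Vec3 = Tuple[float, float, float]


def _nearest(p: Vec3, cloud: List[Vec3]) -> float:
    return min(math.dist(p, q) for q in cloud)


def chamfer_distance(a: List[Vec3], b: List[Vec3]) -> float:
    """Sum of mean nearest-neighbor distances in both directions."""
    forward = sum(_nearest(p, b) for p in a) / len(a)
    backward = sum(_nearest(q, a) for q in b) / len(b)
    return forward + backward


def hausdorff_distance(a: List[Vec3], b: List[Vec3]) -> float:
    """Largest nearest-neighbor distance in either direction."""
    return max(
        max(_nearest(p, b) for p in a),
        max(_nearest(q, a) for q in b),
    )


def voxelize(cloud: List[Vec3], resolution: int) -> Set[Tuple[int, int, int]]:
    """Occupied cells of a resolution^3 grid over [-0.5, 0.5]^3."""
    cells = set()
    for p in cloud:
        cells.add(tuple(
            min(max(int((c + 0.5) * resolution), 0), resolution - 1)
            for c in p
        ))
    return cells


def voxel_iou(a: List[Vec3], b: List[Vec3], resolution: int = 32) -> float:
    """Intersection over union of occupied voxels (resolution is assumed)."""
    va, vb = voxelize(a, resolution), voxelize(b, resolution)
    union = va | vb
    return len(va & vb) / len(union) if union else 1.0
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is acceptable for illustration but would normally be replaced by a spatial index for full meshes.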

The geometric comparison covers CHARM, MeshAnything V2, MeshAnything, BPT, and DeepMesh on all four metrics; the numeric values are not reproduced here. On the perceptual metric, CLIP similarity ranks CHARM first, MeshAnything V2 second, DeepMesh third, MeshAnything fourth, and BPT fifth; the scores themselves are likewise not reproduced here.

The ablations are directly relevant to AnimeHair as a dataset representation. For sequence ordering, counterclockwise ordering yields the best reported values, outperforming X-axis sorting, Y-axis sorting, and Z-axis sorting. For parameterization, the five-parameter method yields the same best values, outperforming the extended vector and explicit vertex alternatives. Hair-Card–Level Metrics are reported to confirm similar trends, showing that individual card shapes are faithfully reconstructed and that the chosen ordering yields the best coherency.

These results support two specific conclusions. First, the dataset’s decomposition into card sequences is empirically consequential rather than merely descriptive. Second, AnimeHair’s control-point format is not only compact, but also associated with the strongest reconstruction and perceptual generation performance among the compared settings.

6. Editing affordances, adjacent tasks, and domain-specific limits

AnimeHair is explicitly framed as artist-friendly and scalable (He et al., 25 Sep 2025). Because each hair card is fully described by $N$ control points with $(p_i, w_i, t_i)$, artists can directly manipulate curve shape $p_i$, cross-section width $w_i$, and thickness $t_i$ without wrestling with thousands of raw vertices or remeshing. The invertible pipeline lets edits round-trip between control points and full meshes losslessly. On the learning side, the drastic token compression makes the representation tractable for modern transformers, enabling autoregressive synthesis of complex, variable-length hairstyles.

A useful boundary condition emerges from adjacent work on 2D anime hair processing. “ToonOut: Fine-tuned Background-Removal for Anime Characters” reports that while state-of-the-art background removal models excel at realistic imagery, they frequently underperform in specialized domains such as anime-style content, where complex features like hair and transparency present unique challenges (Muratori et al., 8 Sep 2025). ToonOut therefore collected and annotated a custom dataset of 1,228 high-quality anime images of characters and objects, using gray-scale “alpha” annotation in which intermediate gray values represent partial transparency such as soft hair edges and stray strands. On a 126-image test split, vanilla BiRefNet scores 95.3 % Pixel Accuracy, while fine-tuning yields 99.5 %; Boundary IoU improves from 88.5 % to 95.6 %.

That comparison clarifies a common misunderstanding. AnimeHair is a 3D hairstyle dataset with separated hair cards and processed mesh data, whereas ToonOut addresses 2D foreground-background segmentation and alpha-mask reconstruction. The shared difficulty is hair-specific structure: stylized geometry in the 3D case, and thin strands plus semi-transparent regions in the 2D case. This suggests that anime hair remains a specialized subdomain across both geometry processing and image segmentation, and that domain-specific curation is central in both settings.
