Anny-One Synthetic Human Mesh Dataset

Updated 7 November 2025

The Anny-One dataset is a fully synthetic, anthropometrically calibrated collection that pairs continuous phenotype parameters with 3D mesh annotations for diverse human modeling.
Its 3D meshes are generated from interpretable attributes such as age, gender, height, and muscle mass, which keeps the resulting body shapes realistic and easy to interpret.
Its Apache 2.0 license and semantic annotations support training for human mesh recovery, avatar creation, and other vision applications.

Anny-One is a large-scale, fully synthetic, anthropometrically calibrated dataset for Human Mesh Recovery (HMR). It is generated with the open, scan-free Anny parametric human body model, which uses procedural knowledge and population statistics to represent a globally diverse range of body shapes, ages, genders, and body proportions within a continuous, interpretable phenotype space. The dataset targets four weaknesses of traditional mesh datasets built on restrictive or proprietary 3D scan corpora: demographic representation, model interpretability, openness, and data scale.

1. Generation Pipeline and Model Foundations

Anny-One's samples are procedurally generated from the Anny body model, which is built on MakeHuman community assets and is entirely scan-free. The Anny mesh template comprises 13,380 vertices and 13,378 quadrilateral faces (not including tongue and eyes), with an articulated skeleton of 163 bones compatible with standard animation pipelines such as Mixamo. Each synthetic human is specified by continuous, interpretable phenotype parameters: age, gender, height, weight, muscle mass, and a variety of local morphological attributes (e.g., head fat, pregnancy, limb proportion). Each parameter is defined in $[0,1]$ and controls morph targets ("blendshapes").
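The role of the $[0,1]$ phenotype parameters can be illustrated with a minimal linear blendshape sketch. The array shapes, the two toy morph targets, and the function name `apply_blendshapes` are assumptions made for illustration; the actual mapping from Anny's phenotype parameters to blendshape weights may be more elaborate and is not reproduced here.

```python
import numpy as np

def apply_blendshapes(template, deltas, weights):
    """Return template + sum_k weights[k] * deltas[k].

    template: (V, 3) rest-shape vertex positions
    deltas:   (K, V, 3) per-target vertex offsets
    weights:  (K,) blend weights, expected in [0, 1]
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0.0) or np.any(weights > 1.0):
        raise ValueError("weights must lie in [0, 1]")
    # Contract the K axis: (K,) with (K, V, 3) gives (V, 3).
    return template + np.tensordot(weights, deltas, axes=1)

# Toy example: 4 vertices and 2 hypothetical morph targets.
template = np.zeros((4, 3))
deltas = np.stack([
    np.tile([0.0, 1.0, 0.0], (4, 1)),  # hypothetical "height" target: shift along y
    np.tile([0.5, 0.0, 0.0], (4, 1)),  # hypothetical "weight" target: shift along x
])
mesh = apply_blendshapes(template, deltas, [0.5, 1.0])
```

In this toy setup every vertex moves by half of the first target's offset plus the full offset of the second, which is the sense in which a parameter in $[0,1]$ interpolates between a neutral and an extreme shape.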

The data generation pipeline instantiates each synthetic human as a paired Anny mesh and HumGen3D rigged character to ensure clothing and accessory diversity. Individuals are placed in photorealistic, procedural indoor environments (Infinigen Indoors) in groups (about 5 people per scene on average), with poses sourced from the AMASS MoCap and GRAB hand pose archives. Up to 40 camera placements per scene (FOV from $30^\circ$ to $130^\circ$) provide human-centric, hand-centric, and variable-perspective imagery.
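To make the FOV range concrete, the sketch below converts a field of view into pinhole camera intrinsics. It assumes a square 1280 px image (the resolution reported for the dataset), a principal point at the image center, and uniform sampling of the FOV; the source does not state the sampling distribution, and the function names are illustrative.

```python
import math
import random

def focal_from_fov(fov_deg, image_size_px):
    """Pinhole focal length in pixels for a given full field of view."""
    return (image_size_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

def intrinsics_from_fov(fov_deg, image_size_px=1280):
    """3x3 intrinsic matrix (nested lists), principal point at the image center."""
    f = focal_from_fov(fov_deg, image_size_px)
    c = image_size_px / 2.0
    return [[f, 0.0, c],
            [0.0, f, c],
            [0.0, 0.0, 1.0]]

rng = random.Random(0)
fov = rng.uniform(30.0, 130.0)  # assumption: FOV drawn uniformly from the stated range
K = intrinsics_from_fov(fov)
```

Under these assumptions the focal length spans roughly 2,390 px at $30^\circ$ down to roughly 300 px at $130^\circ$, which gives a sense of how strongly perspective varies across the camera placements.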

2. Dataset Scope, Diversity, and Phenotypic Calibration

The final dataset consists of 800,000 high-resolution (1280×1280 px) synthetic images, each accompanied by precise 3D mesh and semantic annotations. Anny-One's chief distinction lies in its continuous demographic and physical span:

Ages: From infants to elders (full lifespan coverage).
Body types: Spanning the global range for stature (children to 2.4 m adults), weight, and muscularity.
Global population representativeness: The distributions for each phenotype parameter are calibrated to the latest WHO growth and BMI statistics; samples are drawn as

$p \sim \mathrm{Beta}(\alpha, \beta)$

with calibration of $\alpha$ and $\beta$ conditional on intended (real-world) age/gender means and variances. This process ensures tight alignment of synthetic height-for-age and BMI-for-age curves to empirically measured charts, as visualized in the paper's Figure 1.

Gender: Realized as a continuous parameter in a unified morph space.
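The Beta calibration described above can be sketched with a method-of-moments fit, which chooses $\alpha$ and $\beta$ so that the distribution matches a target mean and variance. This is an assumption about how such a calibration could be done, not the paper's documented procedure, and the target mean and variance below are placeholders rather than WHO values.

```python
import random

def beta_params_from_moments(mean, var):
    """Method-of-moments Beta(alpha, beta) for a target mean and variance on [0, 1]."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_var = mean * (1.0 - mean)  # upper bound on the variance for this mean
    if not 0.0 < var < max_var:
        raise ValueError("var must lie strictly between 0 and mean * (1 - mean)")
    common = max_var / var - 1.0   # equals alpha + beta
    return mean * common, (1.0 - mean) * common

# Placeholder target for one normalized phenotype parameter at a given age/gender.
alpha, beta = beta_params_from_moments(mean=0.6, var=0.02)

rng = random.Random(0)
samples = [rng.betavariate(alpha, beta) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

Repeating the fit for each age and gender bin would yield parameter distributions whose means and variances track the reference charts, which is the alignment the text describes.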

Procedural mesh self-collision detection and physically plausible placement rules prevent interpenetration and anatomical artifacts, while the rendered environments, pose variations, and apparel sampling from HumGen3D yield a broad and realistic range of visual appearances.

3. Data Structure, Annotations, and Technical Interoperability

Each entry in Anny-One includes:

The rendered image (Blender, 1280×1280 px).
Complete Anny parameter vector (continuous phenotype encoding).
3D mesh vertex positions and optional linear maps to SMPL-X/HumGen3D templates.
2D/3D joint locations, pose parameters, segmentation masks, and accessory metadata (scene, camera intrinsics/extrinsics, hand/face keypoints).
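Because each entry carries both 3D joints and camera parameters, the 2D joints can in principle be re-derived by projection. The sketch below uses a standard pinhole model and assumes a world-to-camera convention $x_{cam} = R\,x_{world} + t$ with the camera looking along $+z$; the dataset's actual conventions and file layout are not specified in this article, and all values are illustrative.

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project (N, 3) world points to (N, 2) pixel coordinates."""
    points_cam = points_world @ R.T + t      # world -> camera frame
    if np.any(points_cam[:, 2] <= 0):
        raise ValueError("all points must lie in front of the camera")
    uvw = points_cam @ K.T                   # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]          # perspective divide

# Illustrative camera: 640 px focal length, principal point at the image center.
K = np.array([[640.0, 0.0, 640.0],
              [0.0, 640.0, 640.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 3.0])                # subject 3 m in front of the camera

joints_3d = np.array([[0.0, 0.0, 0.0],
                      [0.3, -0.6, 0.0]])
joints_2d = project_points(joints_3d, K, R, t)
```

Comparing such reprojected joints against the stored 2D annotations is a quick way to confirm that the camera convention has been read correctly.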

Cross-model interoperability is enabled through mappings to SMPL-X and HumGen3D. For the Anny–SMPL-X correspondence, linear regression with barycentric basis initialization followed by mesh distance optimization yields a mean cyclical error of 3.2 mm.
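A cyclical error of this kind can be read as a round-trip measurement: map the vertices to the other template, map them back, and average the per-vertex displacement. The sketch below assumes that reading and uses small random matrices as stand-ins for the learned linear maps; the real Anny and SMPL-X templates are far larger, and the pseudo-inverse used for the return map is a placeholder, not the optimized mapping.

```python
import numpy as np

def cyclical_error_mm(verts_a, a_to_b, b_to_a):
    """Mean per-vertex round-trip distance in millimetres.

    verts_a: (Va, 3) vertex positions in metres
    a_to_b:  (Vb, Va) linear map from template A to template B
    b_to_a:  (Va, Vb) linear map from template B back to template A
    """
    round_trip = b_to_a @ (a_to_b @ verts_a)
    return 1000.0 * np.linalg.norm(round_trip - verts_a, axis=1).mean()

rng = np.random.default_rng(0)
n_a, n_b = 50, 40                              # toy vertex counts
verts_a = rng.normal(size=(n_a, 3))

a_to_b = rng.random((n_b, n_a))
a_to_b /= a_to_b.sum(axis=1, keepdims=True)    # rows sum to 1, like barycentric weights
b_to_a = np.linalg.pinv(a_to_b)                # least-squares stand-in for the return map

err = cyclical_error_mm(verts_a, a_to_b, b_to_a)
```

With identity maps the function returns zero, so any nonzero value reflects information lost in the round trip between the two templates.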

Primary applications include:

HMR model training and evaluation for both single- and multi-person scenarios across all ages, including children and elderly.
Secondary uses: pose estimation supervision, generic scan fitting, synthetic population creation, and high-level avatar generation.

4. Comparative Analysis with Existing Mesh Datasets

Dataset	Openness	Demographic Span	Scale (Images)	Control & Annotation
Anny-One	Apache 2.0 (fully open)	Infants–seniors, continuous genders, global population-calibrated	800,000	Interpretable phenotypes, SMPL-X mapping, no scan dependency
SMPL(-X)/GHUM	Restricted/Proprietary	Western adults, binary gender splits	$<$ 100,000	PCA/latent shape control, scan dependency
CAESAR/AGORA	Proprietary/Restricted	Biased, mostly adult, little age diversity	$<$ 10,000–100,000	Limited, privacy constraints

Unlike traditional scan-based models, Anny-One is unconstrained by privacy or licensing, allowing commercial and research deployment. It offers far broader phenotypic and pose diversity, continuous semantic control, and rigorous real-world statistical calibration. In contrast, established datasets reflect an overrepresentation of Western adult morphologies and are typically inaccessible for open research.

5. Human Mesh Recovery Utility and Benchmarks

HMR models such as HMR2.0 and Multi-HMR, when trained with Anny-One, match or exceed models trained on scan-derived datasets, especially on evaluation suites with extended shape and age coverage (for example, the AGORA and CMU-Toddler benchmarks; see Tables 2 and 3 of the originating paper). Direct semantic control over age, gender, stature, and muscularity during both data synthesis and model supervision improves generalization to underrepresented demographics, which PCA/latent-based approaches model poorly. Anny-One's physically valid procedural generation and mesh annotation eliminate pseudo-fitting artifacts and enable millimeter-accurate ground-truth mesh supervision.

6. Technical Rigor, Licensing, and Extensibility

All assets, data, and generation code are distributed under the Apache 2.0 license, which permits reuse and extension. The dataset design follows established practices in procedural anthropometric modeling, statistical calibration, mesh annotation, and interoperability, which facilitates reproducibility and integration in downstream pipelines. The use of MakeHuman procedural assets and open demographic statistics distinguishes Anny-One from scan-driven, privacy-constrained, or proprietary datasets, and further permits unencumbered data scaling.

7. Significance and Prospective Applications

Anny-One enables both academic and commercial research in human mesh recovery, 3D avatar creation, synthetic population simulation, and human-centric vision, regardless of deployment domain or age/gender specificity. Its semantic phenotype parameterization supports interpretable, controlled population variation, and facilitates downstream tasks requiring broad demographic generalization, such as medical imaging, ergonomic design, and inclusive computing applications.

A plausible implication is that the combination of semantic control, global demographic coverage, and licensing freedom in Anny-One can lower the barriers to developing equitable, transparent, and robust downstream vision-and-graphics systems. In summary, Anny-One represents a step-change in public access, demographic inclusivity, and practical annotation for large-scale 3D human mesh datasets, establishing a foundation for more representative and interpretable modeling in human-centric computing.
