Anny Body Model: Differentiable 3D Mesh

Updated 9 November 2025

Anny Body Model is a fully differentiable, parametric 3D human mesh model that uses explicit, interpretable phenotype vectors based on WHO statistics.
It employs piecewise-multilinear blendshape interpolation with differentiable PyTorch layers and dual-quaternion skinning for efficient mesh recovery and scan fitting.
Applications include human mesh recovery, AR/VR avatar generation, and population modeling; sparse linear regressors provide compatibility with models such as SMPL-X.

Anny Body Model refers to Anny—a fully differentiable, parametric, and scan-free 3D human body model designed to offer interpretable, demographically explicit, and reproducible mesh representations across the human lifespan. Its core innovation is the use of direct anthropometric control, calibrated on World Health Organization (WHO) statistics, and implemented via interpretable blendshapes derived from the MakeHuman open-source asset library. Anny provides precise, semantically driven modulation of human mesh geometry via phenotype vectors, addressing limitations of prior scan-trained models (proprietary data, narrow demographics, opaque latent codes) and establishing a new open standard for generative, population-realistic, and easily fitted 3D body modeling (Brégier et al., 5 Nov 2025).

1. Model Philosophy and Phenotypic Representation

Anny replaces low-dimensional, statistically learned shape spaces (e.g., from SMPL, STAR, GHUM) with a vector of meaningful phenotype variables, $\varphi$, where each element $\varphi_p \in [0,1]$ corresponds to a human trait, such as age, gender, height, weight, BMI, muscle mass, upper leg length, head fat, or pregnancy. These variables interpolate between artist-defined mesh blendshapes from MakeHuman. Unlike PCA shape spaces, Anny’s $\varphi$ vector is interpretable and fully exposed. The mapping from parameter values to geometry is deterministic, monotonic, and supports interpolation across and between demographic groups (infant to elder, male/female, extreme short/tall, pregnant, etc.).

All deformations (blendshape, skinning, kinematics) are implemented as differentiable PyTorch layers, incorporating NVIDIA Warp for efficient forward-kinematics and dual-quaternion blend-skinning. This enables standard autodiff optimization pipelines, supporting both mesh recovery from images and gradient-based point-cloud fitting.

2. Mathematical Model and Statistical Calibration

The Anny mesh is formulated as follows:

Let $V^0 \in \mathbb{R}^{3 \times N}$ be the template mesh ($N = 13{,}380$ vertices). For each of $M$ blendshape axes, MakeHuman artists define a displacement field $\Delta V_i \in \mathbb{R}^{3 \times N}$. The rest-pose mesh for phenotype $\varphi$ is

$V(\varphi) = V^0 + \sum_{i=1}^M w_i(\varphi)\,\Delta V_i$

where $w_i(\varphi)$ is a piecewise-multilinear weighting function of the phenotype vector, yielding continuous, semantically graded transitions between archetypal shapes.
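The blendshape sum above can be illustrated with a small, dependency-free sketch. It is not Anny's implementation: the two-vertex "mesh", the knot positions, and the displacement values are hypothetical toy data, and the function names `hat_weights` and `blend_mesh` are introduced here only for illustration. Each phenotype axis gets piecewise-linear "hat" weights over its knots, and the weight of a blendshape target is the product of the per-axis weights, which is what makes the interpolation piecewise-multilinear.

```python
from itertools import product

def hat_weights(x, knots):
    """Piecewise-linear interpolation weights of x over sorted knots (they sum to 1)."""
    x = min(max(x, knots[0]), knots[-1])
    w = [0.0] * len(knots)
    for k in range(len(knots) - 1):
        lo, hi = knots[k], knots[k + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            w[k], w[k + 1] = 1.0 - t, t
            break
    return w

def blend_mesh(template, targets, phenotype, knots):
    """Evaluate V(phi) = V0 + sum_i w_i(phi) * dV_i with multilinear weights.

    `targets` maps a tuple of knot indices (one per phenotype axis) to a list
    of per-vertex displacements; the weight of each target is the product of
    the 1D hat weights along every axis.
    """
    axis_w = [hat_weights(x, k) for x, k in zip(phenotype, knots)]
    verts = [list(v) for v in template]
    for idx in product(*(range(len(k)) for k in knots)):
        w = 1.0
        for axis, i in enumerate(idx):
            w *= axis_w[axis][i]
        if w == 0.0 or idx not in targets:
            continue
        for v, d in zip(verts, targets[idx]):
            for c in range(3):
                v[c] += w * d[c]
    return verts

# Hypothetical toy data: a "feet" vertex and a "head" vertex.
template = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
knots = [[0.0, 0.5, 1.0], [0.0, 1.0]]   # height axis, weight axis
dy = [-0.3, 0.0, 0.2]                   # head offset at each height knot
dx = [0.0, 0.1]                         # head offset at each weight knot
targets = {(i, j): [(0.0, 0.0, 0.0), (dx[j], dy[i], 0.0)]
           for i in range(3) for j in range(2)}

verts = blend_mesh(template, targets, (0.75, 0.5), knots)
```

Because the weights are continuous and piecewise linear in every phenotype variable, the resulting vertex positions vary continuously with $\varphi$ and admit gradients almost everywhere.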

Posing is applied as: $V_p(\varphi, \theta) = \mathrm{Skin}(\mathrm{Kin}(\theta), V(\varphi))$ where $\theta$ collects the pose of each bone of the 163-bone skeleton, $\mathrm{Kin}$ encodes the forward kinematics, and $\mathrm{Skin}$ applies dual-quaternion blend-skinning.
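To make the $\mathrm{Skin}$ step concrete, the following is a minimal pure-Python sketch of dual-quaternion linear blending for a single vertex. It assumes each bone transform is already available as a rotation (axis, angle) plus a translation; the helper names are hypothetical, and the sketch does not reproduce Anny's batched PyTorch/Warp implementation.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def qconj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def dual_quat(axis, angle, translation):
    """Unit dual quaternion (real, dual) for a rotation followed by a translation."""
    n = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / n
    real = (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)
    dual = tuple(0.5 * c for c in qmul((0.0, *translation), real))
    return real, dual

def blend(dqs, weights):
    """Weighted sum of dual quaternions, renormalized by the real-part norm."""
    ref = dqs[0][0]
    real, dual = [0.0] * 4, [0.0] * 4
    for (qr, qd), w in zip(dqs, weights):
        # Flip the sign when needed so all rotations lie in the same hemisphere.
        sign = -1.0 if sum(a * b for a, b in zip(qr, ref)) < 0.0 else 1.0
        for k in range(4):
            real[k] += sign * w * qr[k]
            dual[k] += sign * w * qd[k]
    norm = math.sqrt(sum(c * c for c in real))
    return tuple(c / norm for c in real), tuple(c / norm for c in dual)

def transform(dq, point):
    """Apply a unit dual quaternion to a 3D point: rotate, then translate."""
    real, dual = dq
    rotated = qmul(qmul(real, (0.0, *point)), qconj(real))[1:]
    trans = qmul(dual, qconj(real))[1:]
    return tuple(r + 2.0 * t for r, t in zip(rotated, trans))

# A vertex influenced equally by a fixed bone and a bone rotated 90 degrees about z.
bone_a = dual_quat((0.0, 0.0, 1.0), 0.0, (0.0, 0.0, 0.0))
bone_b = dual_quat((0.0, 0.0, 1.0), math.pi / 2.0, (0.0, 0.0, 0.0))
skinned = transform(blend([bone_a, bone_b], [0.5, 0.5]), (1.0, 0.0, 0.0))
```

In this example the blended transform is a 45-degree rotation, so the vertex stays at unit distance from the joint. Linear blending of the two rotation matrices would instead shrink it toward the joint, which is the usual motivation for dual-quaternion skinning.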

For demographic representativeness, Anny fits a conditional Beta distribution $\mathrm{Beta}(\alpha_p(a,g), \beta_p(a,g))$ for each phenotype parameter $p$, conditioned on chronological age $a$ and gender $g$, so that samples drawn from these distributions match the empirical moments (mean, standard deviation) of height, weight, and BMI reported in WHO growth charts.
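One simple way to obtain Beta parameters from a mean and standard deviation is the method of moments, sketched below. This is an illustrative assumption rather than the calibration procedure of the paper: it presumes the target moments have already been expressed in normalized phenotype units in $(0,1)$, and the numeric values are hypothetical.

```python
import random

def beta_from_moments(mean, std):
    """Method-of-moments Beta parameters for a target mean and std on (0, 1)."""
    var = std * std
    if not 0.0 < mean < 1.0 or var >= mean * (1.0 - mean):
        raise ValueError("moments are not attainable by a Beta distribution")
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common

def sample_phenotype(mean, std, rng):
    """Draw one phenotype value in (0, 1) from the moment-matched Beta."""
    alpha, beta = beta_from_moments(mean, std)
    return rng.betavariate(alpha, beta)

# Hypothetical normalized moments for one (age, gender) group.
rng = random.Random(0)
alpha, beta = beta_from_moments(0.5, 0.1)
samples = [sample_phenotype(0.5, 0.1, rng) for _ in range(20000)]
```

For a mean of 0.5 and a standard deviation of 0.1 this gives $\alpha = \beta = 12$, and the sample mean of the draws stays close to 0.5.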

When Anny is used as a scan-fitting prior or generative model, regularization terms penalize deviation from the Beta priors (e.g., via their negative log-likelihood), encourage mesh smoothness (e.g., Laplacian regularization), and can enforce collision constraints.
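As an illustration of the prior term, the negative log-likelihood of a Beta distribution and its derivative with respect to the phenotype value can be written in a few lines. The function names and the clamping constant `eps` are choices made for this sketch, not part of the published model.

```python
import math

def _clamp(phi, eps):
    """Keep phi strictly inside (0, 1) so the logarithms stay finite."""
    return min(max(phi, eps), 1.0 - eps)

def beta_nll(phi, alpha, beta, eps=1e-6):
    """Negative log-density of Beta(alpha, beta) evaluated at phi."""
    phi = _clamp(phi, eps)
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (log_norm
            - (alpha - 1.0) * math.log(phi)
            - (beta - 1.0) * math.log(1.0 - phi))

def beta_nll_grad(phi, alpha, beta, eps=1e-6):
    """Derivative of beta_nll with respect to phi."""
    phi = _clamp(phi, eps)
    return -(alpha - 1.0) / phi + (beta - 1.0) / (1.0 - phi)
```

For a symmetric prior such as $\mathrm{Beta}(12, 12)$ the gradient vanishes at $\varphi_p = 0.5$ and the penalty grows as the value moves toward either end of $[0,1]$, which pulls a fit toward plausible phenotypes.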

3. Parameter Mapping, Calibration, and Scan Fitting

Mappings between real-world traits and Anny’s phenotype parameters are explicit and, in key cases, bijective:

$\varphi_\mathrm{age}$ is mapped to chronological age by a one-to-one lookup (e.g., $\varphi_\mathrm{age} = 0 \rightarrow 0$ years, $\varphi_\mathrm{age} = 1 \rightarrow 90$ years).
Conditional Beta parameter fitting ensures that a sampled $\varphi_\mathrm{height}$ or $\varphi_\mathrm{weight}$ (given age, gender) produces mesh heights and weights matching those expected by WHO standards.
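A one-to-one lookup of this kind can be sketched as a strictly increasing piecewise-linear table that is interpolated in both directions. Only the two endpoints (0 maps to 0 years, 1 maps to 90 years) come from the text above; the intermediate knot is a hypothetical placeholder, and the function names are introduced here for illustration.

```python
from bisect import bisect_right

# (phi_age, years) knots. The endpoints follow the text; the middle knot is
# a hypothetical placeholder used only to show the piecewise behaviour.
AGE_KNOTS = [(0.0, 0.0), (0.5, 25.0), (1.0, 90.0)]

def _interp(x, xs, ys):
    """Piecewise-linear interpolation of y(x) over strictly increasing xs."""
    x = min(max(x, xs[0]), xs[-1])
    k = max(1, min(bisect_right(xs, x), len(xs) - 1))
    t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + t * (ys[k] - ys[k - 1])

def phi_to_years(phi):
    """Map the age phenotype value to chronological age in years."""
    return _interp(phi, [p for p, _ in AGE_KNOTS], [y for _, y in AGE_KNOTS])

def years_to_phi(years):
    """Inverse mapping; well defined because the table is strictly increasing."""
    return _interp(years, [y for _, y in AGE_KNOTS], [p for p, _ in AGE_KNOTS])
```

Since both columns of the table increase strictly, the mapping is bijective on its range and `years_to_phi(phi_to_years(x))` returns `x` up to rounding.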

Scan fitting (registration) proceeds by minimizing the energy: $\min_{\varphi,\theta} \sum_j \|P_j - \Pi(V_p(\varphi,\theta))\|^2 + \lambda_\mathrm{shape} R(\varphi) + \lambda_\mathrm{pose} R(\theta)$ where $P_j$ are scan points, $\Pi$ denotes the closest-point operator (for each $P_j$, it returns the nearest point of the posed mesh), and $R(\varphi), R(\theta)$ are priors (including the Beta calibration for phenotype variables and standard pose regularizers). Gradient descent, with back-propagation through the fully differentiable pipeline, solves for the best fit.
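The structure of this optimization can be shown on a deliberately tiny problem. The sketch below makes several simplifying assumptions that are not in the source: a single phenotype variable, a fixed pose, a 2D four-vertex "mesh", the closest vertex standing in for the closest surface point, a quadratic penalty standing in for the Beta prior, and a hand-derived gradient standing in for automatic differentiation.

```python
def posed_vertices(phi, template, delta):
    """Toy model V(phi) = V0 + phi * dV (single phenotype axis, fixed pose)."""
    return [tuple(v + phi * d for v, d in zip(vt, dl))
            for vt, dl in zip(template, delta)]

def closest_index(point, verts):
    """Index of the model vertex nearest to a scan point."""
    return min(range(len(verts)),
               key=lambda i: sum((p - v) ** 2 for p, v in zip(point, verts[i])))

def fit_phi(scan, template, delta, prior_mean=0.5, lam=0.0,
            phi=0.2, lr=0.1, steps=200):
    """Gradient descent on sum_j ||P_j - v_c(j)(phi)||^2 + lam * (phi - prior_mean)^2."""
    for _ in range(steps):
        verts = posed_vertices(phi, template, delta)
        grad = 2.0 * lam * (phi - prior_mean)
        for p in scan:
            i = closest_index(p, verts)
            # d/dphi ||p - v_i(phi)||^2 = -2 (p - v_i) . dV_i
            grad += -2.0 * sum((pc - vc) * dc
                               for pc, vc, dc in zip(p, verts[i], delta[i]))
        phi = min(max(phi - lr * grad, 0.0), 1.0)
    return phi

# Hypothetical toy data: a rectangle whose top edge rises with phi.
template = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]
delta = [(0.0, 0.0), (0.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
scan = posed_vertices(0.7, template, delta)

phi_data_only = fit_phi(scan, template, delta)
phi_with_prior = fit_phi(scan, template, delta, prior_mean=0.5, lam=1.0)
```

Without the prior the fit recovers the value 0.7 used to generate the scan; with the prior switched on, the estimate settles between the data optimum and the prior mean, which is the trade-off controlled by $\lambda_\mathrm{shape}$.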

4. Implementation Architecture and Interoperability

The Anny model leverages:

A lightweight mesh (13,380 vertices, 163 bones), computationally and memory efficient yet sufficient for both fine and gross shape deformations.
Piecewise multilinear blendshape interpolation functions, ensuring differentiable and smooth geometric transitions.
Forward kinematics and dual-quaternion skinning for articulated deformation.
BVH-based face self-collision detection during fitting.

Efficient mapping to and from other prominent body models is supported. Sparse linear regressors enable Anny parameter vectors to be translated into SMPL-X or HumGen3D parameters, with a mean cyclic error of ~3 mm, ensuring compatibility with widespread morphable model ecosystems.

All code and models are released under the Apache 2.0 license.

5. Synthetic Population Generation and Human Mesh Recovery

The Anny-One synthetic human corpus comprises 800,000 photorealistic human meshes, generated by:

Sampling $\varphi$ from population-calibrated Beta distributions (mirroring natural demographic diversity).
Sampling body and hand poses from the AMASS and GRAB datasets.
Translating mesh state into HumGen3D for clothing simulation.
Procedurally placing multiple characters per scene in Infinigen-Indoors environments.
Rendering in Blender with randomized, realistic intrinsics (FOV 30°–130°, resolution $1280^2$), including camera placements for egocentric and hand-closeup perspectives.
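The camera randomization in the last step can be sketched with the standard pinhole relation $f = (W/2) / \tan(\mathrm{FOV}/2)$. The uniform sampling of the field of view, the interpretation of FOV as horizontal, the square pixels, and the centered principal point are all assumptions of this sketch; the source only states the FOV range and the resolution.

```python
import math
import random

def focal_from_fov(fov_deg, width):
    """Pinhole focal length in pixels for a horizontal field of view."""
    return 0.5 * width / math.tan(math.radians(fov_deg) / 2.0)

def sample_intrinsics(rng, width=1280, height=1280, fov_range=(30.0, 130.0)):
    """Draw one set of camera intrinsics with a uniformly sampled FOV (assumption)."""
    fov_deg = rng.uniform(*fov_range)
    focal = focal_from_fov(fov_deg, width)
    return {"fov_deg": fov_deg, "fx": focal, "fy": focal,
            "cx": width / 2.0, "cy": height / 2.0}

rng = random.Random(0)
cameras = [sample_intrinsics(rng) for _ in range(1000)]
```

At 1280 pixels wide, a 90° field of view corresponds to a focal length of 640 pixels; the 30° and 130° extremes give roughly 2389 and 298 pixels.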

This resource enables training of HMR (Human Mesh Recovery) networks (HMR2.0, Multi-HMR) modified to output the Anny $\varphi$ and $\theta$ vectors directly.

6. Quantitative and Qualitative Experimental Results

Model-agnostic HMR benchmarks (HMR2.0+Anny vs. HMR2.0+SMPL-X) on the 3DPW and EHF datasets show near-identical Mean Per Joint Position Error (MPJPE) and Per-Vertex Error (PVE), with Anny outperforming in PA-MPJPE (49.4 mm vs. 52.0 mm for SMPL-X) despite no scan-based learning.
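For reference, MPJPE is the mean Euclidean distance between predicted and ground-truth joints, and PVE applies the same formula to mesh vertices. A minimal sketch follows, with a hypothetical function name and optional root alignment. PA-MPJPE additionally applies a Procrustes similarity alignment (rotation, translation, scale) before measuring the error; that step is omitted here.

```python
import math

def mpjpe(pred, gt, root=None):
    """Mean per-point Euclidean error, in the units of the input coordinates.

    If `root` is an index, both point sets are first translated so that this
    joint sits at the origin (root-relative evaluation).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of points")
    if root is not None:
        pred = [tuple(c - r for c, r in zip(p, pred[root])) for p in pred]
        gt = [tuple(c - r for c, r in zip(g, gt[root])) for g in gt]
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

# Hypothetical joints in millimetres: the prediction is a pure translation
# of the ground truth by (30, 40, 0).
gt_joints = [(0.0, 0.0, 0.0), (0.0, 500.0, 0.0), (0.0, 1000.0, 0.0)]
pred_joints = [(x + 30.0, y + 40.0, z) for x, y, z in gt_joints]

absolute_error = mpjpe(pred_joints, gt_joints)
root_relative_error = mpjpe(pred_joints, gt_joints, root=0)
```

Here the absolute error is 50 mm, while the root-relative error is zero because the prediction differs from the ground truth only by a global translation.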

On AGORA (adults and children), pre-training on Anny-One and finetuning yields the best PA-MPJPE across population splits, with not only competitive accuracy for adults but also state-of-the-art results for children (overall 48.2 mm, 41.5 mm for kids).

For multi-person HMR, Multi-HMR with Anny achieves 41.8 mm PA-MPJPE on 3DPW—improving on AiOS (45.0 mm) and baseline Multi-HMR (46.9 mm)—and delivers consistent performance and shape plausibility across further benchmarks (EMDB, Hi4D, CMU-Toddler).

Qualitatively, Anny demonstrates robust age/shape adaptation across diverse scenes, accurate child/adult distinction in composite environments, and stable pose recovery even under significant visual or self-occlusion.

7. Strengths, Limitations, and Applications

Anny offers:

Full demographic span (infant to elderly, male/female, pregnancy) with interpretable, semantically meaningful phenotype control.
Population realism derived from open anthropometric statistics; no privacy or data-rights complications from 3D-scan dependency.
High-fidelity scan fitting (2.4 mm error on adults, 2.7 mm on children) and HMR generalization matching or exceeding proprietary scan-learned models.
Large-scale, demographically and pose-diverse synthetic human data (Anny-One) for computer vision and graphics applications.

Limitations:

Anny’s phenotype-to-geometry mapping is derived from artist stereotypes (MakeHuman); it lacks non-anthropometric shape variation and is not a statistical model of identity.
No explicit garment, clothing, or dynamic soft-tissue handling beyond blendshape-based attribute encoding.
Demographic diversity reflects MakeHuman coverage; does not exhaustively represent ethnic, ability, or extreme morphologies.

Applications span human mesh recovery (single/multi-person, image/video), precise scan registration for virtual try-on or telepresence, controlled graphics and AR/VR avatar generation, as well as population modeling in anthropological and ergonomic research anchored to reproducible global standards.

In summary, the Anny Body Model establishes a new baseline for fully open, interpretable, demographically calibrated, and differentiable human mesh modeling, providing a practical foundation for scientific, graphics, and perception pipelines that require precise, controllable human geometry over the lifespan (Brégier et al., 5 Nov 2025).

References (1)

Human Mesh Modeling for Anny Body (2025)