- The paper introduces a novel avatar reconstruction method using body-anchored Gaussians and positive normal offsets to ensure intersection-free, physically plausible garment layering.
- It employs a pipeline that integrates multi-view segmentation lifting, topology-aware label refinement, and independent layer optimization for robust geometric accuracy and semantic control.
- Experimental results on the 4D-DRESS dataset demonstrate state-of-the-art performance, notably reducing garment-body intersections and producing simulation-ready mesh extraction.
DAMA: Physically Plausible Layered Avatars with Disentangled Body-Anchored Gaussians
Introduction
The DAMA framework proposes a novel method for the reconstruction of multi-layered, physically plausible human avatars from multi-view RGB imagery and corresponding segmentation masks. Unlike prior approaches that focus solely on visual fidelity or poseable geometry, DAMA introduces a representation and reconstruction pipeline that enforces geometric and physical plausibility by design—specifically targeting clean garment separation and explicit stacking order. DAMA’s key innovation lies in the anchoring of anisotropic Gaussians to SMPL-X faces via barycentric coordinates and strictly positive normal offsets, enabling intersection-free garments, semantic identity preservation under deformation, and user-defined garment stacking and reordering.
Existing neural avatar models, including volumetric radiance fields and Gaussian splatting avatars, prioritize appearance modeling and generic pose control while neglecting explicit garment decomposition and hierarchical layering [kerbl3Dgaussians, huang20242dgs]. Some disentanglement works, such as GALA [kim2024gala] and Disco4D [pang2025disco4d], attempt garment isolation via 2D mask lifting and optimization-based intersection penalties. These approaches, however, produce ambiguous segmentation boundaries and interpenetrating layers, and they do not encode layer ordering as an explicit geometric constraint.
Template-driven garment separation [pons2017clothcap, bhatnagar2019multi] achieves multi-layered disentanglement but is restricted to fixed templates and requires high-quality 3D input. The more recent Gaussian Wardrobe [chen2026gaussianwardrobe] permits stacking but fixes the layer order during training and resolves collisions post hoc in image space, which rules out garment reordering at inference.
DAMA aims to overcome these limitations via direct geometric encoding: each Gaussian is bound to a mesh face and placed with a strictly positive normal offset, which enforces garment separation, surface adherence, and intersection avoidance. Topology-aware label refinement further improves segmentation robustness, especially in self-occluded regions.
Methodology
Anchored Gaussian Representation
DAMA anchors each Gaussian to a specific SMPL-X face using barycentric coordinates within the local face and a positive normal offset $\delta_i^l > 0$, ensuring that all garment layer Gaussians lie exterior to the body and preceding layers. This formulation prevents lateral drift, maintains semantic assignment through articulations, and eliminates interpenetration.
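The mean of the $i$-th Gaussian in layer $l$ is given by:
$$\mu_i^l = \sum_{k=1}^{3} b_{ik}^l \, v_k + \delta_i^l \sum_{k=1}^{3} b_{ik}^l \, n_k$$
where $b_{ik}^l$ are barycentric coordinates, $v_k$ and $n_k$ denote the positions and normals of the three vertices of the anchoring SMPL-X face, and $\delta_i^l > 0$ is the normal offset.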
Rotation is determined as a relative quaternion to the SMPL-X face orientation, and scales and opacities are set to capture locally isotropic geometry and visibility across the mesh.
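The mean formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `anchored_means`, the array layout, and the softplus mapping used to keep the offset strictly positive are all assumptions made for the example; the source only states that the offset is positive.

```python
import numpy as np

def anchored_means(vertices, normals, faces, face_idx, bary, raw_offset):
    """Place Gaussians on mesh faces with strictly positive normal offsets.

    vertices:   (V, 3) vertex positions of the (posed) body mesh
    normals:    (V, 3) per-vertex normals
    faces:      (F, 3) vertex indices per face
    face_idx:   (N,)   anchoring face of each Gaussian
    bary:       (N, 3) barycentric coordinates (non-negative, rows sum to 1)
    raw_offset: (N,)   unconstrained parameter mapped to a positive offset
    """
    tri = faces[face_idx]                          # (N, 3) vertex ids per Gaussian
    v = vertices[tri]                              # (N, 3, 3) corner positions
    n = normals[tri]                               # (N, 3, 3) corner normals
    surface = np.einsum("nk,nkd->nd", bary, v)     # sum_k b_k * v_k
    direction = np.einsum("nk,nkd->nd", bary, n)   # sum_k b_k * n_k
    # Assumption: softplus keeps delta > 0 for any real-valued raw parameter.
    delta = np.logaddexp(0.0, raw_offset)
    return surface + delta[:, None] * direction

# One triangle in the z = 0 plane with normals pointing along +z.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
normals = np.tile([0.0, 0.0, 1.0], (3, 1))
faces = np.array([[0, 1, 2]])
mu = anchored_means(
    vertices, normals, faces,
    face_idx=np.array([0]),
    bary=np.array([[1 / 3, 1 / 3, 1 / 3]]),
    raw_offset=np.array([0.0]),
)
```

Because the mean is a function of the current vertex positions and normals, re-evaluating it on a posed mesh moves the Gaussian with its face, which is what keeps the semantic assignment stable under articulation.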
Reconstruction Pipeline
- Segmentation Lifting: Multi-view image masks are lifted to SMPL-X-anchored Gaussians, providing initial geometry and label assignments. Label smoothness, scale, and normal alignment losses are used to stabilize optimization.
- Topology-Aware Refinement: Labels are projected onto the mesh’s topology, and connected label regions below an area threshold are relabeled via majority vote among adjacent faces. This step rectifies occlusion- and segmentation-induced errors.
- Fine Geometry and Appearance Optimization: Each semantic layer is extracted, initialized with multiple Gaussians per face, and optimized independently for geometry and masked appearance supervision. Layer-level losses provide robust regularization and prevent color/geometry leakage.
- Layer Composition and Animation: Layers are merged per user-defined order, with Gaussian means updated according to underlying stacking. The construction naturally supports SMPL-X pose animation by re-evaluating Gaussian parameters per posed mesh.
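The topology-aware refinement step can be illustrated with a small sketch. It assumes per-face labels and per-face areas are already available, finds connected same-label regions over edge-adjacent faces, and relabels any region whose total area falls below a threshold by majority vote among its bordering faces. The names `face_adjacency`, `refine_labels`, and `min_area` are hypothetical, and details such as tie-breaking and the use of edge adjacency are choices made for the example rather than taken from the paper.

```python
from collections import Counter, deque

import numpy as np

def face_adjacency(faces):
    """Map each face to the set of faces that share an edge with it."""
    edge_to_faces = {}
    for f, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces.setdefault((min(u, v), max(u, v)), []).append(f)
    adj = [set() for _ in faces]
    for shared in edge_to_faces.values():
        for f in shared:
            adj[f].update(g for g in shared if g != f)
    return adj

def refine_labels(faces, labels, areas, min_area):
    """Relabel small connected label regions by majority vote of their border."""
    adj = face_adjacency(faces)
    orig = np.asarray(labels)
    out = orig.copy()
    seen = np.zeros(len(faces), dtype=bool)
    for start in range(len(faces)):
        if seen[start]:
            continue
        # Breadth-first search over neighbours carrying the same original label.
        comp, queue = [], deque([start])
        seen[start] = True
        while queue:
            f = queue.popleft()
            comp.append(f)
            for g in adj[f]:
                if not seen[g] and orig[g] == orig[start]:
                    seen[g] = True
                    queue.append(g)
        if sum(areas[f] for f in comp) >= min_area:
            continue  # region is large enough to keep
        comp_set = set(comp)
        border = [orig[g] for f in comp for g in adj[f] if g not in comp_set]
        if border:
            out[comp] = Counter(border).most_common(1)[0][0]
    return out

# A strip of five triangles; the middle face carries a spurious label.
faces = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4), (4, 5, 6)]
refined = refine_labels(faces, labels=[0, 0, 1, 0, 0], areas=[1.0] * 5, min_area=1.5)
```

Votes are taken from the original labels rather than the partially updated ones, so the result does not depend on the order in which regions are visited.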
Applications
- Garment Transfer and Stacking: Garments are transferred between avatars, utilizing underlying SMPL-X correspondence. Collisions are resolved by stacking offsets, and transferred garments are locally refined for appearance consistency.
- Simulation-Ready Mesh Extraction: Gaussian layers yield meshes by averaging Gaussian means incident to each SMPL-X vertex; the mesh topology and ordering ensure intersection-free, physically consistent simulation input.
- Hair Transfer and Layer Reordering: The representation extends to hair and shoes, facilitating layer-wise transfer, stacking, and user-defined reordering.
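The vertex-averaging idea behind simulation-ready mesh extraction can be sketched as follows. Each Gaussian is assumed to contribute its mean to the three vertices of its anchoring face, and every vertex takes the average of the contributions it receives. The function name `layer_vertices` and the `fallback` positions used for vertices that no Gaussian touches are assumptions for the example; the source does not say how uncovered vertices are handled.

```python
import numpy as np

def layer_vertices(means, face_idx, faces, num_vertices, fallback):
    """Average Gaussian means over the faces incident to each vertex.

    means:        (N, 3) Gaussian means of one garment layer
    face_idx:     (N,)   anchoring face of each Gaussian
    faces:        (F, 3) vertex indices per face
    num_vertices: number of vertices in the template mesh
    fallback:     (V, 3) positions used where no Gaussian is incident (assumed)
    """
    sums = np.zeros((num_vertices, 3))
    counts = np.zeros(num_vertices)
    for k in range(3):
        vid = faces[face_idx, k]        # k-th corner of each Gaussian's face
        np.add.at(sums, vid, means)     # unbuffered add handles repeated ids
        np.add.at(counts, vid, 1.0)
    covered = counts > 0
    out = np.array(fallback, dtype=float)
    out[covered] = sums[covered] / counts[covered, None]
    return out, covered

# Two faces sharing an edge, one Gaussian per face, plus one unused vertex.
faces = np.array([[0, 1, 2], [1, 3, 2]])
means = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 3.0]])
verts, covered = layer_vertices(
    means, face_idx=np.array([0, 1]), faces=faces,
    num_vertices=5, fallback=np.zeros((5, 3)),
)
```

Since the extracted vertices inherit the template's connectivity and the layers are ordered by construction, the resulting meshes share a consistent topology across garments.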
Quantitative and Qualitative Evaluation
On the 4D-DRESS dataset [wang20244ddress], DAMA achieves state-of-the-art performance in geometric accuracy and physical plausibility:
| Method | Chamfer Dist. (mm) ↓ | Penetration Rate (%) ↓ | Penetration Depth (mm) ↓ | PSNR ↑ | LPIPS ↓ |
| --- | --- | --- | --- | --- | --- |
| GALA | 28.78 | 36.48 | 28.43 | 32.43 | 0.025 |
| Disco4D | 28.86 | 49.08 | 18.20 | 32.81 | 0.040 |
| 2DGS | 21.88 | 51.83 | 8.91 | 35.21 | 0.031 |
| DAMA | 19.88 | 1.46 | 0.32 | 30.15 | 0.035 |
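The representation yields near-zero garment-body intersections in both full avatars and individual garment layers (upper, lower, outer). In the table above, the penetration rate drops to 1.46% against 36.48% to 51.83% for the baselines, and the penetration depth to 0.32 mm against 8.91 mm to 28.43 mm, an improvement of more than an order of magnitude on both metrics. Segmentation and topology refinement prevent floating or mislabeled geometry. Appearance metrics remain within range of the baselines given the strict geometric constraints, though DAMA’s PSNR (30.15) is the lowest of the four methods and its LPIPS (0.035) ranks third.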
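The order-of-magnitude claim can be checked directly from the reported numbers. The snippet below divides each baseline's penetration rate and depth by DAMA's value; the figures are copied from the table, while the helper name `improvement_factors` is introduced only for this illustration.

```python
# Values copied from the 4D-DRESS results table.
penetration_rate = {"GALA": 36.48, "Disco4D": 49.08, "2DGS": 51.83, "DAMA": 1.46}
penetration_depth = {"GALA": 28.43, "Disco4D": 18.20, "2DGS": 8.91, "DAMA": 0.32}

def improvement_factors(metric, ours="DAMA"):
    """Ratio of each baseline's value to ours for a lower-is-better metric."""
    return {name: value / metric[ours] for name, value in metric.items() if name != ours}

rate_factors = improvement_factors(penetration_rate)
depth_factors = improvement_factors(penetration_depth)

for name in rate_factors:
    print(f"{name}: rate x{rate_factors[name]:.1f}, depth x{depth_factors[name]:.1f}")
```

The smallest ratios are roughly 25 for penetration rate (against GALA) and roughly 28 for penetration depth (against 2DGS), so the gap exceeds a factor of ten against every baseline.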
Ablation studies demonstrate that positive normal offsets are essential for intersection-free layering, and topology-based label refinement is vital for segmentation accuracy. Loss ablations indicate the importance of isotropic scale and canonical alignment for robust geometry in occluded regions and under animation.
Implications and Future Directions
Practically, DAMA enables fine-grained control over garment manipulation, including stacking, transfer, and conversion to simulation-ready geometry, all with explicit layer ordering. Theoretically, the parameterization advances the geometric encoding of avatars, improving physical plausibility and semantic stability under arbitrary deformations.
For generative models and virtual try-on scenarios, DAMA’s explicit stacking and garment ordering address the lack of compositional control seen in diffusion-based or radiance field avatars. This representation could integrate with generative pipelines to produce physically consistent avatars from text or monocular images.
Future research should aim to handle loose or non-conforming garments, incorporate temporal garment deformation from video sequences, and address layer-wise geometry for dynamic simulation, potentially integrating explicit physics in the reconstruction loop.
Conclusion
DAMA establishes a new paradigm for physically plausible, controllable, multi-layered human avatars via SMPL-X-anchored Gaussian representations. The method achieves clean garment disentanglement, explicit stacking, and intersection-free layering, validated by strong quantitative evidence and robust applications. This approach unlocks new avenues in avatar reconstruction, simulation, and controllable generative modeling for virtual environments and digital apparel workflows (2605.21001).