Structured Garment Morphing

Updated 4 July 2026

Structured garment morphing is a method where deformation and editing are governed by explicit structures such as UV maps, sewing patterns, and panel graphs.
It leverages topology-aware representations and sewing-pattern embeddings to maintain garment semantics and ensure structure preservation during fit adaptation.
These methods enhance garment personalization by addressing issues like ill-fitting clothing, ensuring seam consistency, and enabling simulation-ready, multi-pose garment design.

Structured garment morphing can be understood as the family of garment methods in which deformation, fitting, reconstruction, or editing is governed by explicit structure rather than by unconstrained surface change. In the literature, that structure may be the wearer’s pose and movement envelope, a topology-aware UV map, a sewing pattern with panels and stitches, a panel–seam graph, a layered occlusion order, or a category-level correspondence field. Representative formulations include direct 3D rest-shape adaptation across multiple body poses for personalized clothing (Wolff et al., 2021), topology-aware UV-position maps for shape and style editing (Su et al., 2020), sewing-pattern embeddings and UV-position maps with masks (Chen et al., 2022), conditional diffusion over layout-consistent UV textures (Vidaurre et al., 24 Mar 2025), and template-free structured garment specifications that encode panel boundaries, parameterized seams, and explicit stitch topology (Li et al., 23 Jun 2026).

1. Problem setting and motivation

A recurring motivation is that standard sizes do not fit most bodies well. The practical consequences listed in the literature are ill-fitting clothes, high return rates, wasted production, and garments that are uncomfortable in motion. In personalized design, the problem is explicitly framed as one of body diversity in height, proportions, asymmetry, limb length, and body shape, together with the observation that fit is not determined by a single neutral posture. A shirt that appears acceptable in a T-pose may become uncomfortably tight when the wearer raises or lowers the arms, so garments should be designed for the range of poses the wearer actually uses, not only one static scan (Wolff et al., 2021).

The same shift away from static geometry appears in other settings. For manipulated garments, the difficulty is that garments are being manipulated rather than worn, so they can undergo large folding, crumpling, and self-occlusions, and the shape is no longer constrained by a human body (Li et al., 2024). In image-based virtual try-on, a further complication is multi-layer dressing: multi-layer VTON requires realistic deformation and layering of an inner garment and an outer garment, and the central challenge becomes the modeling of occlusion relationships so that redundant inner-garment features do not interfere with generation (Yu et al., 20 Jan 2026).

This suggests that structured garment morphing is not a single algorithmic primitive but a problem class. The shared requirement is that a garment must be adapted while preserving some explicit semantics: freedom of movement, panel structure, seam compatibility, layer order, garment category structure, or simulation validity.

2. Representation families

One major line of work represents garments in UV space while keeping topology explicit. DeepCloth introduces a topology-aware UV-position map together with a UV-mask, so that garment geometry and garment topology are represented jointly. Because a hard binary mask does not interpolate smoothly, the method converts the mask into a continuous field with a bi-distance transform,

$\mathcal{T}(Mask_g)=\mathcal{DT}(Mask_g)-\mathcal{DT}(\mathcal{I}-Mask_g),$

and for skirts and dresses it uses an independent cylindrical UV parameterization rather than forcing them into the SMPL UV layout (Su et al., 2020). DiffusedWrinkles adopts a related but generative formulation: 3D garment deformations are encoded as a 2D displacement map stored as an RGB image on a shared UV layout, and the posed garment is written as

$M_\text{g}(\beta,\theta,\mathbf{p}) = W\!\left(T_\text{g}(\beta,\theta,\mathbf{p}),\, J(\beta),\, \theta,\, \mathcal{W}\right).$

In this representation, local wrinkles are generated in UV space and projected back into 3D through the known UV parameterization, making the approach mesh-topology-agnostic while preserving explicit garment geometry (Vidaurre et al., 24 Mar 2025).
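The bi-distance transform used for the DeepCloth UV-mask can be illustrated with a minimal sketch. It assumes that $\mathcal{DT}$ returns, for each pixel, the Euclidean distance to the nearest zero pixel, so that the resulting field is positive inside the mask and negative outside; the source does not state the convention, and the brute-force distance computation and the helper names are illustrative only.

```python
import numpy as np

def distance_transform(mask: np.ndarray) -> np.ndarray:
    """Distance from every pixel to the nearest zero pixel (brute force).

    Assumed convention: zero pixels get distance 0. Only suitable for
    tiny illustrative masks because it builds an (H*W, K) distance table.
    """
    mask = np.asarray(mask, dtype=bool)
    zeros = np.argwhere(~mask)                      # coordinates of zero pixels
    if zeros.size == 0:
        return np.full(mask.shape, np.inf)          # no zero pixel to measure to
    coords = np.argwhere(np.ones_like(mask))        # all pixel coordinates, row-major
    diff = coords[:, None, :] - zeros[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    return dist.reshape(mask.shape)

def bi_distance_transform(mask: np.ndarray) -> np.ndarray:
    """T(Mask) = DT(Mask) - DT(I - Mask): a signed, continuous field."""
    mask = np.asarray(mask, dtype=bool)
    return distance_transform(mask) - distance_transform(~mask)

def interpolate_masks(mask_a, mask_b, alpha: float) -> np.ndarray:
    """Blend two masks in the continuous field, then threshold at zero."""
    field = (1.0 - alpha) * bi_distance_transform(mask_a) \
            + alpha * bi_distance_transform(mask_b)
    return field > 0

# A 5x5 mask with a 3x3 block of ones in the middle.
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True
field = bi_distance_transform(mask)
```

Thresholding the blended field at zero gives an intermediate mask, which is the property a hard binary mask lacks under direct interpolation.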

A second line of work makes sewing structure the latent space. Neural Sewing Machines represent a garment by a sewing-pattern graph,

$G=(V,E),$

extend it with basic panel groups, and decode the resulting embedding into UV-position maps with masks for individual panels (Chen et al., 2022). GarmageNet likewise treats garments as structured panel collections, defining a garment asset as

$\mathcal{G}=\bigl\{(G_i, \mathbf{B}_i, \mathbf{S}_i)\,\, | \,\, i=1,\dots,N\bigr\},$

where each panel is a 4-channel geometry image, together with a 3D bounding box and a normalized 2D scale (Li et al., 2 Apr 2025). PatternGSL pushes this logic further by representing a garment as a hierarchical JSON-like specification with a meta block, a panels array, and a stitches array. It explicitly encodes ordered panel vertices, edge geometry via straight, quadratic Bézier, cubic Bézier, and circular-arc edges, 3D panel placement, and stitch references through (panel_id, edge_index) pairs (Li et al., 23 Jun 2026).
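A structured specification of this kind can be sketched as a nested dictionary. The field names (`id`, `vertices`, `edges`, `control`) and the two-panel example are hypothetical, since the text states only that the format has a meta block, a panels array, and a stitches array with (panel_id, edge_index) references. The sketch covers straight and quadratic Bézier edges only.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical schema: only the meta/panels/stitches split comes from the text.
spec: Dict = {
    "meta": {"units": "cm"},
    "panels": [
        {
            "id": "front",
            "vertices": [(0.0, 0.0), (40.0, 0.0), (40.0, 60.0), (0.0, 60.0)],
            # Edge i runs from vertex i to vertex (i + 1) % n.
            "edges": [
                {"type": "line"},
                {"type": "line"},
                {"type": "quadratic", "control": (20.0, 50.0)},
                {"type": "line"},
            ],
        },
        {
            "id": "back",
            "vertices": [(0.0, 0.0), (40.0, 0.0), (40.0, 60.0), (0.0, 60.0)],
            "edges": [{"type": "line"}] * 4,
        },
    ],
    # Each stitch joins two edges, each referenced as (panel_id, edge_index).
    "stitches": [(("front", 1), ("back", 3)), (("front", 3), ("back", 1))],
}

def sample_edge(panel: Dict, edge_index: int, n: int = 8) -> List[Point]:
    """Sample n + 1 points along one panel edge."""
    verts = panel["vertices"]
    p0 = verts[edge_index]
    p2 = verts[(edge_index + 1) % len(verts)]
    edge = panel["edges"][edge_index]
    ts = [i / n for i in range(n + 1)]
    if edge["type"] == "line":
        return [((1 - t) * p0[0] + t * p2[0], (1 - t) * p0[1] + t * p2[1]) for t in ts]
    if edge["type"] == "quadratic":
        c = edge["control"]
        return [
            ((1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * c[0] + t ** 2 * p2[0],
             (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * c[1] + t ** 2 * p2[1])
            for t in ts
        ]
    raise ValueError(f"unsupported edge type: {edge['type']}")

def edge_length(panel: Dict, edge_index: int, n: int = 64) -> float:
    """Approximate edge length as the length of the sampled polyline."""
    pts = sample_edge(panel, edge_index, n)
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

panel_by_id = {p["id"]: p for p in spec["panels"]}
```

Because each stitch names its two edges directly, a pattern-level check such as comparing the lengths of stitched edges needs only a dictionary lookup and no mesh processing.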

A third line of work reconstructs coupled 2D–3D garment structure. ReWeaver predicts 3D curves, 3D patches, 2D edges, and a binary patch–curve connectivity matrix from sparse multi-view RGB images, thereby producing a structured 2D–3D garment graph rather than an unstructured surface (Li et al., 23 Jan 2026). Inverse Garment and Pattern Modeling with a Differentiable Simulator represents the pattern as a planar mesh $U$ whose shape is controlled by boundary control points, while the sewn 3D garment $X$ is the simulated drape of that pattern around an SMPL body (Yu et al., 2024).

The common implication is that structured garment morphing depends strongly on representation choice. UV fields, panel groups, structured geometry images, graph-based patterns, and explicit stitch lists all provide a decomposition in which deformations remain attributable to meaningful garment entities.

3. Pose, motion, and temporal coherence

In personalized design, pose awareness is explicit. “Designing Personalized Garments with Body Movement” builds a garment directly in 3D on a scanned avatar, without starting from a 2D sewing pattern, and transfers boundaries across registered poses through barycentric coordinates. Its core loop alternates cloth simulation and rest-shape adaptation. The garment has a rest shape $\hat{\mathbf x}_i$ and a simulation mesh $\mathbf x_i$; stretch is measured triangle-wise through the singular values of a deformation gradient $\mathbf F$, and when stretch exceeds the allowable range $[0,1+\delta]$, the rest shape is updated by clipping singular values and reconstructing a valid surface with as-rigid-as-possible (ARAP) surface modeling. The cloth model is based on Baraff and Witkin, and collision is handled through a signed distance field (Wolff et al., 2021).
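The triangle-wise stretch measurement and singular-value clipping can be sketched as follows. This is a minimal illustration for a single triangle. It assumes the rest triangle is given in 2D local coordinates and the deformed triangle in 3D, and the function names are hypothetical. The ARAP reconstruction that turns the clipped gradients back into a consistent rest surface is not shown.

```python
import numpy as np

def deformation_gradient(rest_2d: np.ndarray, deformed_3d: np.ndarray) -> np.ndarray:
    """F = Ds @ inv(Dm) for one triangle (rest_2d: 3x2, deformed_3d: 3x3)."""
    Dm = np.column_stack([rest_2d[1] - rest_2d[0], rest_2d[2] - rest_2d[0]])              # 2x2
    Ds = np.column_stack([deformed_3d[1] - deformed_3d[0], deformed_3d[2] - deformed_3d[0]])  # 3x2
    return Ds @ np.linalg.inv(Dm)

def clip_stretch(F: np.ndarray, s_min: float, s_max: float):
    """Return singular values, clipped values, and the clipped gradient."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)   # U: 3x2, s: (2,), Vt: 2x2
    s_clipped = np.clip(s, s_min, s_max)
    return s, s_clipped, U @ np.diag(s_clipped) @ Vt

# Rest triangle, and the same triangle stretched by 1.5 along x.
rest = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
deformed = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.0, 0.0]])

delta = 0.1                                  # illustrative tolerance, not from the paper
F = deformation_gradient(rest, deformed)
s, s_clipped, F_clipped = clip_stretch(F, 0.0, 1.0 + delta)
```

In this example the larger singular value exceeds the upper bound and is clipped, while the unstretched direction is left unchanged.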

Garment4D formulates temporal morphing over point cloud sequences as three stages: sequential garments registration, canonical garment estimation, and posed garment reconstruction. Registration establishes a common topology across sequences of the same garment type; canonical estimation predicts a pose-independent garment; and posed reconstruction deforms that canonical garment by combining Interpolated Linear Blend Skinning, a Proposal-Guided Hierarchical Feature Network, an Iterative Graph Convolution Network, and a Temporal Transformer. The explicit purpose of the Temporal Transformer is to capture smooth garment motions, and the full system is designed to model garment dynamics caused by garment–body interaction, especially for loose garments such as skirts (Hong et al., 2021).

LoBoFit addresses refitting between a source avatar and a target avatar in arbitrary poses by replacing global vertex-space optimization with Local Bone Mapping Blending. A garment vertex is mapped into each bone’s local frame, reconstructed back into global space, and then blended across bones.

The method uses a pose-robust initialization, optimizes bone-local coordinate residuals and blending weight residuals, and employs contact, preservation, and regularization terms to preserve fit style, silhouette, and especially fine-scale wrinkles (Zhang et al., 8 May 2026).
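The map-to-local, reconstruct, and blend idea can be illustrated with rigid bone frames. The following is a sketch under strong assumptions and not the LoBoFit formulation. Each bone is taken to be a rotation `R` and origin `t`, the blending weights are fixed, and the optimized residuals and the contact, preservation, and regularization terms are omitted. All names are hypothetical.

```python
import numpy as np

def to_local(v: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Express a global point in a bone frame with rotation R and origin t."""
    return R.T @ (v - t)

def to_global(p: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Map bone-local coordinates back to global space."""
    return R @ p + t

def refit_vertex(v, src_bones, tgt_bones, weights) -> np.ndarray:
    """Carry one garment vertex from source bones to target bones.

    The vertex is encoded in every source bone frame, decoded with the
    matching target bone frame, and the candidates are blended by weight.
    """
    out = np.zeros(3)
    for (Rs, ts), (Rt, tt), w in zip(src_bones, tgt_bones, weights):
        out += w * to_global(to_local(v, Rs, ts), Rt, tt)
    return out

I3 = np.eye(3)
src = [(I3, np.array([0.0, 0.0, 0.0])), (I3, np.array([0.0, 1.0, 0.0]))]
# Target avatar: the second bone is shifted by one unit along x.
tgt = [(I3, np.array([0.0, 0.0, 0.0])), (I3, np.array([1.0, 1.0, 0.0]))]

v = np.array([0.2, 1.0, 0.0])
w = [0.25, 0.75]                 # illustrative weights that sum to one
v_new = refit_vertex(v, src, tgt, w)
```

With identical source and target frames the vertex is returned unchanged, which is a convenient sanity check for any implementation of this kind.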

DiffusedWrinkles adds a generative temporal formulation. A garment state is represented as a UV displacement texture conditioned on body shape $\beta$, pose $\theta$, and garment design $\mathbf{p}$, and temporal coherence is improved by additionally conditioning the model on the previous frame’s garment texture.

This changes the model from framewise synthesis to a stateful process in which the previous garment state acts as an anchor (Vidaurre et al., 24 Mar 2025).

Across these formulations, morphing is not merely a spatial interpolation. It is frequently a controlled update of a rest shape, a canonical template, or a local coordinate field under body motion and temporal constraints.

4. Structure preservation, sewing validity, and inverse design

When the goal is simulation-ready or fabrication-compatible output, preserving structure means preserving sewability, panel integrity, and seam consistency. Neural Sewing Machines make this explicit through three losses: an inner-panel structure-preserving loss, an inter-panel structure-preserving loss, and a surface-normal loss, which are combined with a reconstruction term in a single training objective. These terms preserve within-panel geometry, stitched-edge coincidence, and local surface orientation during reconstruction and manipulation (Chen et al., 2022).

Inverse Garment and Pattern Modeling with a Differentiable Simulator treats garment recovery as an inverse design problem. Starting from a user-selected base garment template, it performs linear grading and then a differentiable optimization over pattern variables and physical/material variables. Its loss has two parts: one that includes a curvature-weighted Chamfer term, an open-contour term, and a material regularizer, and one that enforces compatible seam-edge lengths. The method further exploits inter-panel symmetry, intra-panel symmetry, and Mean Value Coordinates (MVC) so that pattern edits preserve mesh structure and manufacturing validity (Yu et al., 2024).
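A seam-length compatibility term can be sketched as a penalty on the length difference between the two edges joined by each seam. The squared-difference form and the function names are assumptions for illustration. The exact term used by the method is not reproduced here.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

def polyline_length(points: Sequence[Point]) -> float:
    """Total length of a 2D polyline given as ordered points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def seam_length_penalty(seams: List[Tuple[Sequence[Point], Sequence[Point]]]) -> float:
    """Sum of squared length mismatches over all seams.

    Each seam is a pair of pattern edges (as polylines) that will be sewn
    together. The penalty is zero when every pair has equal length.
    """
    return sum((polyline_length(a) - polyline_length(b)) ** 2 for a, b in seams)

matched = ([(0.0, 0.0), (0.0, 10.0)], [(5.0, 0.0), (5.0, 4.0), (5.0, 10.0)])
mismatched = ([(0.0, 0.0), (10.0, 0.0)], [(0.0, 5.0), (12.0, 5.0)])

penalty_ok = seam_length_penalty([matched])
penalty_bad = seam_length_penalty([matched, mismatched])
```

A term of this kind is differentiable with respect to the pattern's boundary points almost everywhere, which is what lets it sit alongside a simulation-based fitting term in a gradient-based optimization.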

PatternGSL emphasizes deterministic validity rather than optimization-based repair. Its decoder reconstructs geometry from the generated specification, uses boundary samples as a fallback when curve parameters are missing or corrupted, and applies deterministic sanitation rules including merging short collinear edges, removing invalid panels, and validating stitch references. Because the representation exposes panel geometry and stitch topology directly, editing operations such as panel scaling, curve adjustment, component removal, and sleeve spread are performed at the pattern level rather than as generic mesh deformations (Li et al., 23 Jun 2026).
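Deterministic sanitation of this kind can be sketched with two small passes. The thresholds, the dictionary layout, and the rule that a panel needs at least three vertices are assumptions made for the example and are not PatternGSL's actual rules. The sketch also omits remapping of edge indices after a merge.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def merge_short_collinear(vertices: List[Point], min_len: float = 1.0,
                          tol: float = 1e-9) -> List[Point]:
    """Drop a vertex when its two adjacent edges are collinear and one is short."""
    kept = list(vertices)
    changed = True
    while changed and len(kept) > 3:
        changed = False
        n = len(kept)
        for i in range(n):
            prev, cur, nxt = kept[i - 1], kept[i], kept[(i + 1) % n]
            ax, ay = cur[0] - prev[0], cur[1] - prev[1]
            bx, by = nxt[0] - cur[0], nxt[1] - cur[1]
            collinear = abs(ax * by - ay * bx) <= tol and (ax * bx + ay * by) > 0
            short = min(math.hypot(ax, ay), math.hypot(bx, by)) < min_len
            if collinear and short:
                del kept[i]          # the two edges collapse into one
                changed = True
                break
    return kept

def sanitize(spec: Dict) -> Dict:
    """Remove degenerate panels, then drop stitches with dangling references."""
    panels = {p["id"]: p for p in spec["panels"] if len(p["vertices"]) >= 3}

    def valid(ref) -> bool:
        panel_id, edge_index = ref
        return panel_id in panels and 0 <= edge_index < len(panels[panel_id]["vertices"])

    stitches = [s for s in spec["stitches"] if valid(s[0]) and valid(s[1])]
    return {"panels": list(panels.values()), "stitches": stitches}

square_with_sliver = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
cleaned = merge_short_collinear(square_with_sliver)

raw = {
    "panels": [
        {"id": "front", "vertices": [(0, 0), (10, 0), (10, 10), (0, 10)]},
        {"id": "sliver", "vertices": [(0, 0), (1, 0)]},          # degenerate panel
    ],
    "stitches": [
        (("front", 0), ("front", 2)),    # valid
        (("front", 1), ("sliver", 0)),   # refers to a removed panel
        (("front", 7), ("front", 0)),    # edge index out of range
    ],
}
clean = sanitize(raw)
```

Both passes are rule-based and give the same output for the same input, which is the sense in which validity is deterministic rather than the result of an optimization.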

GarmageNet complements these approaches with a stitching module. It extracts panel contours from the alpha channel of each geometry image, resamples them into 3D and UV point sets, fuses PointNet++ features for geometry and UV with point transformer blocks, predicts a matching matrix with Sinkhorn, and obtains discrete correspondences with the Hungarian algorithm. Those point correspondences are then converted into vectorized seam relationships suitable for assembling a simulation-ready garment (Li et al., 2 Apr 2025).

A plausible implication is that “structured” in garment morphing often becomes synonymous with “editable without destroying sewing logic.” This is especially evident when panel boundaries, stitch references, and seam lengths are first-class variables.

5. Diffusion and multimodal conditioning

Diffusion models have been incorporated into garment morphing not only for photorealism but also for structured conditioning. DiffCloth is a Stable Diffusion-based latent diffusion model that addresses two specific garment-generation errors: garment part leakage and attribute confusion. It extracts visual garment parts by semantic segmentation and textual Attribute-Phrases (APs) by constituency parsing, matches parts to APs as a bipartite matching problem solved with the Hungarian algorithm, and adds a semantic-bundled cross-attention loss so that adjectives and part nouns within an AP attend to similar spatial regions. For editing, it derives blended masks from bundled attention maps and confines denoising updates to the intended region (Zhang et al., 2023).

DiffusedWrinkles uses a conditional diffusion model to learn a distribution over UV displacement images conditioned on body shape $\beta$, pose $\theta$, and garment design $\mathbf{p}$. Its training data are simulated with ArcSim across 17 designs and 52 motion sequences from AMASS, with each frame rasterized into a UV texture. Because the model is generative, it can synthesize multiple plausible wrinkle configurations for the same conditioning input, and with previous-frame conditioning it generates temporally coherent sequences (Vidaurre et al., 24 Mar 2025).

GO-MLVTON treats multi-layer virtual try-on as an exemplar-based image inpainting problem in latent diffusion space. It introduces Garment Occlusion Learning (GOL) to compute an occlusion attention map, uses that map to refine the inner-garment latent, and performs fitting with a StableDiffusion v1.5–based UNet in the Garment Morphing & Fitting (GMF) module. The model is initialized from InstructPix2Pix pretrained weights, removes cross-attention blocks following CATVTON, and uses classifier-free guidance. The associated metric, Layered Appearance Coherence Difference (LACD), gives extra weight to connecting regions between adjacent garment layers (Yu et al., 20 Jan 2026).
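Classifier-free guidance, mentioned above, combines a conditional and an unconditional noise prediction at each denoising step. The sketch below shows the standard combination rule only. The arrays stand in for UNet outputs, and the guidance scale is a placeholder value and not the one used by GO-MLVTON.

```python
import numpy as np

def classifier_free_guidance(eps_uncond: np.ndarray, eps_cond: np.ndarray,
                             scale: float) -> np.ndarray:
    """Extrapolate from the unconditional prediction toward the conditional one.

    scale = 0 returns the unconditional prediction, scale = 1 the
    conditional one, and scale > 1 amplifies the effect of the condition.
    """
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Stand-ins for the two noise predictions of a denoising network.
eps_uncond = np.array([0.0, 1.0])
eps_cond = np.array([1.0, 1.0])

guided = classifier_free_guidance(eps_uncond, eps_cond, scale=2.5)  # placeholder scale
```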

For manipulated garments, “Reconstruction of Manipulated Garment with Guided Deformation Prior” extends Implicit Sewing Patterns (ISP) with a diffusion prior over concatenated UV-position maps and mask maps, maps incomplete point clouds into UV space with sparse 3D convolutions and a transformer encoder, and then performs guided reverse diffusion so that the completed UV map matches observed sparse measurements and a recovered panel mask (Li et al., 2024).

These systems show that diffusion is being used in several structurally distinct ways: as UV-space wrinkle synthesis, as AP-level cross-modal alignment, as occlusion-aware latent inpainting, and as guided completion of structured UV observations.

6. Applications, evaluation, and recurring limitations

The application range is broad. Personalized clothing design uses multi-pose body scans and direct 3D editing on the avatar (Wolff et al., 2021). Single-view reconstruction and controllable manipulation are addressed by Neural Sewing Machines (Chen et al., 2022). DiffCloth supports garment synthesis and manipulation from text prompts (Zhang et al., 2023). GO-MLVTON targets multi-layer virtual try-on (Yu et al., 20 Jan 2026). UniGarmentManip transfers unfolding, folding, and hanging actions across garments by learning dense category-level correspondence and adapting it with one-shot or few-shot demonstrations (Wu et al., 2024). ReWeaver reconstructs topology-aware garments from sparse multi-view images for 3D perception, physical simulation, and robotic manipulation (Li et al., 23 Jan 2026).

Evaluation criteria reflect the same structural diversity. NSM reports Chamfer distance, Point-to-surface Euclidean distance (P2S), and MGLE, with Chamfer: 1.65, P2S: 1.46, and MGLE: 3.54 for reconstruction from sewing patterns, and Chamfer: 2.08, P2S: 1.90, and MGLE: 3.73 for single-view reconstruction on a dataset of about 22,400 samples and 12 base categories (Chen et al., 2022). PatternGSL reports 2D Chamfer distance: 5.78 mm, 2D IoU: 86.34%, Stitch accuracy: 98.48%, Draping success rate: 99.2%, and 3D Chamfer after simulation: 6.31 mm on PatternGSLData, which contains 300K samples (Li et al., 23 Jun 2026). ReWeaver reports several reconstruction metrics together with IoU, emphasizing topology accuracy and seam–panel consistency (Li et al., 23 Jan 2026). GarmageNet reports lowest MMD: 8.83, best COV: 58.13, best 1-NNA: 33.86, together with the stitch metrics CP, CR, and AMD (Li et al., 2 Apr 2025).
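Two of the recurring metrics, Chamfer distance and IoU, can be sketched in a few lines. Definitions of Chamfer distance vary between papers (squared or unsquared distances, sums or means). The version below assumes the symmetric sum of mean nearest-neighbor Euclidean distances and is not necessarily the definition used by any of the cited works.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, D) and b (M, D).

    Assumed definition: mean nearest-neighbor distance from a to b plus
    mean nearest-neighbor distance from b to a.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)   # (N, M) pairwise distances
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

def mask_iou(m1: np.ndarray, m2: np.ndarray) -> float:
    """Intersection over union of two boolean masks of the same shape."""
    m1, m2 = np.asarray(m1, dtype=bool), np.asarray(m2, dtype=bool)
    union = np.logical_or(m1, m2).sum()
    if union == 0:
        return 1.0                      # two empty masks are treated as identical
    return float(np.logical_and(m1, m2).sum() / union)

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 0.0], [1.0, 1.0]])
cd = chamfer_distance(a, b)

m1 = np.zeros((4, 4), dtype=bool)
m1[0:2, :] = True                       # rows 0-1
m2 = np.zeros((4, 4), dtype=bool)
m2[1:3, :] = True                       # rows 1-2
iou = mask_iou(m1, m2)
```

Because the conventions differ, Chamfer values from different papers are generally comparable only when the definition and units are stated.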

Several limitations recur. DiffusedWrinkles notes that collisions remain a challenge, long-term dynamics are not modeled, and the design family is constrained by the chosen parametric template (Vidaurre et al., 24 Mar 2025). Inverse differentiable simulation is slow, sensitive to the initial pattern, and assumes the target can be represented within the base template family selected by the user (Yu et al., 2024). NSM explicitly states that pose deformation is not modeled and that very irregular garment panels are not well handled (Chen et al., 2022). PatternGSL reports a topology range of 2 to 37 panels and notes that severe front/back ambiguity remains challenging (Li et al., 23 Jun 2026).

A common misconception is that garment morphing denotes only interpolation between two meshes. The literature shows broader meanings: latent transitions between garment topologies in DeepCloth (Su et al., 2020), rest-shape adaptation under stretch and pose changes (Wolff et al., 2021), structured manipulation of sewing patterns and topology in NSM (Chen et al., 2022), UV-space stochastic deformation synthesis in DiffusedWrinkles (Vidaurre et al., 24 Mar 2025), and correspondence-driven transfer of manipulation intent in UniGarmentManip (Wu et al., 2024). Another common misconception is that visual plausibility alone implies structural usefulness. The repeated emphasis on seam consistency, sewing topology, panel validity, differentiable simulation, and deterministic decoding indicates that simulation-ready garment morphing is a stricter objective than realistic rendering alone (Yu et al., 2024, Li et al., 23 Jan 2026, Li et al., 23 Jun 2026, Li et al., 2 Apr 2025).

Taken together, these works describe a transition from pattern-first, static, standard-size tailoring or unstructured surface prediction toward 3D shape-first, pose-aware, topology-aware, and sewing-aware garment optimization. This suggests that the defining feature of structured garment morphing is not any single deformation model, but the insistence that garment change remain indexed by explicit structure: body movement, UV correspondence, panel layout, seam topology, layer order, or category-level function.