Dress Code Dataset for AI Fashion

Updated 15 September 2025

A dress code dataset is a curated collection of annotated photographic and 3D garment images designed for multi-level classification and simulation tasks.
It provides hierarchical labels, semantic maps, and detailed sewing patterns to aid in virtual try-on, digital garment generation, and precise fashion retrieval.
Researchers use this resource to benchmark deep learning models, from CNN-based virtual try-on pipelines to 3D generative garment simulations, enhancing AI fashion analysis.

A dress code dataset is a curated collection of annotated data about clothing items, particularly dresses and related garments, intended to support research in computer vision, fashion informatics, virtual try-on, digital garment generation, and hierarchical categorization. Such datasets provide high-resolution images, garment attributes, 3D scans, sewing patterns, and semantic labels across multiple levels (category, sub-category, property), and have become central benchmarks for both academic and industrial progress in AI-driven fashion analysis.

1. Dataset Structures: Modalities and Annotation Schemes

Dress code datasets are structured diversely to facilitate multi-level classification, retrieval, try-on, and simulation tasks. Notable examples include multi-level annotated image datasets (Ferreira et al., 2018), paired product-model image datasets for virtual try-on (Morelli et al., 2022), 3D/4D scan datasets with mesh and garment annotations (Ma et al., 2019, Wang et al., 29 Apr 2024), and synthetic datasets featuring sewing patterns and material information (Korosteleva et al., 27 May 2024).

Key annotation schemes:

Hierarchical Labels: Category (e.g., dresses, shirts), sub-category (e.g., cocktail dress, day dress), and attributes (e.g., color, pattern); these typically involve multi-label outputs and class imbalance.
Paired Representations: Each garment is represented by a catalog image and a corresponding model-worn image, enabling direct supervised learning for virtual try-on.
3D and 4D Geometry: Textured meshes, pose registrations, and garment displacement fields that capture complex clothing deformation under motion (Ma et al., 2019, Wang et al., 29 Apr 2024).
Sewing Patterns and Material Properties: Parametric panels, body measures, and textile simulations that bridge 2D pattern design with 3D draped shapes (Korosteleva et al., 27 May 2024).
Semantic Parsing and Segmentation Maps: Detailed pixelwise or vertexwise labels for regions and garment types, with template-free parsing pipelines (Wang et al., 29 Apr 2024).
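
To make the hierarchical and paired annotation schemes above concrete, the following is a minimal sketch of how one dataset entry might be represented. The class name, field names, and file paths are all hypothetical and are not taken from any of the cited datasets.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GarmentRecord:
    """One hypothetical dataset entry; every field name here is illustrative."""
    garment_id: str
    category: str                     # top level, e.g. "dresses"
    sub_category: str                 # second level, e.g. "cocktail dress"
    attributes: List[str] = field(default_factory=list)  # multi-label tags
    catalog_image: Optional[str] = None  # path to the product shot
    worn_image: Optional[str] = None     # path to the model-worn shot

    def is_paired(self) -> bool:
        """True when both images needed for try-on supervision are present."""
        return self.catalog_image is not None and self.worn_image is not None

    def label_path(self) -> List[str]:
        """Category-to-sub-category path, ordered from coarse to fine."""
        return [self.category, self.sub_category]


# Example entry with made-up identifiers and paths.
record = GarmentRecord(
    garment_id="000001",
    category="dresses",
    sub_category="cocktail dress",
    attributes=["red", "floral"],
    catalog_image="images/000001_1.jpg",
    worn_image="images/000001_0.jpg",
)
print(record.label_path(), record.is_paired())
```

Keeping the attribute list separate from the category path reflects the multi-label nature of attributes, while the two image fields capture the catalog/worn pairing that try-on models rely on.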

2. Data Collection, Preprocessing, and Annotation Pipelines

Data acquisition combines large-scale scraping from platforms (e.g., Farfetch (Ferreira et al., 2018), Pinterest (Li et al., 2020)), high-end multi-camera capture setups (Wang et al., 29 Apr 2024), and synthetic generation with simulation pipelines (Korosteleva et al., 27 May 2024).

Preprocessing routines:

Image Standardization: Resizing to fixed inputs (e.g., 224×224, 1024×768), filtering for clean backgrounds, monochromaticity checks, and removal of non-clothing objects (Li et al., 2020).
Augmentation: Flipping, cropping, and rotation for regularization (Ferreira et al., 2018).
Multi-View and Temporal Alignment: Rendering captured 3D/4D scans into multi-angle images; using deep human parsers, optical flow, and graph cut optimization for temporal and spatial label consistency (Wang et al., 29 Apr 2024).
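
The standardization and augmentation steps above can be sketched in a few lines. This is an illustrative, dependency-free version that works on grayscale images stored as nested lists; real pipelines would use an image library. The function names are hypothetical, and the border-uniformity test is only one assumed way to approximate a clean-background filter, not the procedure used in the cited papers.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Resize to a fixed size with nearest-neighbor sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def hflip(img: Image) -> Image:
    """Horizontal flip, a common augmentation for regularization."""
    return [row[::-1] for row in img]


def has_uniform_border(img: Image, tolerance: int = 0) -> bool:
    """Assumed clean-background heuristic: border pixels span at most `tolerance`."""
    border = img[0] + img[-1] + [row[0] for row in img] + [row[-1] for row in img]
    return max(border) - min(border) <= tolerance


tiny = [[1, 2], [3, 4]]
print(resize_nearest(tiny, 4, 4))
print(hflip(tiny))
print(has_uniform_border([[0, 0, 0], [0, 9, 0], [0, 0, 0]]))
```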

Annotation strategies:

Manual and Semi-Automatic Labeling: Combination of crowdsourcing, human-in-the-loop corrections, and automated semantic segmentation pipelines (e.g., Graphonomy, SAM) (Wang et al., 29 Apr 2024).
Hierarchical Tagging: Explicit construction of category trees for multi-level information propagation (Ferreira et al., 2018).
Embedding Sewing Patterns: Automatic measurement extraction, curve projection, and sampling strategies for parametric pattern design (Korosteleva et al., 27 May 2024).
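
As an illustration of hierarchical tagging, a category tree can be stored as a child-to-parent map and used to expand fine-grained tags into their ancestors or to check label consistency. The sketch below shows only the data structure; the tree contents and function names are hypothetical, and it is not the learned message-propagation model of Ferreira et al.

```python
from typing import Dict, List, Optional, Set

# Hypothetical child -> parent map; None marks a root category.
PARENT: Dict[str, Optional[str]] = {
    "dresses": None,
    "shirts": None,
    "cocktail dress": "dresses",
    "day dress": "dresses",
    "polo shirt": "shirts",
}


def ancestors(label: str) -> List[str]:
    """Return the path from the root category down to `label`."""
    path = [label]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path[::-1]


def expand_labels(labels: Set[str]) -> Set[str]:
    """Add every ancestor of each tag so coarse levels stay consistent."""
    expanded: Set[str] = set()
    for label in labels:
        expanded.update(ancestors(label))
    return expanded


def is_consistent(category: str, sub_category: str) -> bool:
    """Check that a predicted sub-category actually belongs to the category."""
    return PARENT.get(sub_category) == category


print(ancestors("cocktail dress"))
print(is_consistent("dresses", "polo shirt"))
```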

3. Technical Architectures and Benchmarks

Several deep learning models leverage dress code datasets as central benchmarks:

Unified Hierarchical CNNs with Message Propagation: Architectures that pair a ResNet-50 backbone with multi-branch dense layers, whose latent representations pass through bidirectional message-passing blocks to inject hierarchical structure (Ferreira et al., 2018).
Virtual Try-On Pipelines: Warping modules (TPS-based), semantic-aware discriminators (pixel-level cross-entropy), and U-Net decoders for detailed garment transfer while enforcing semantic parsing (Morelli et al., 2022).
3D Generative Models: Mesh-VAE-GANs extending body models like SMPL, supporting conditional generative deformation over pose and garment type, with patchwise discriminators for wrinkle detail (Ma et al., 2019).
Siamese and Retrieval Networks: Inception-ResNet v1 adapted for clothing, trained with triplet and center loss for embedding fine-grained differences, facilitating robust street-to-shop garment matching (Khaund et al., 2019).
Physics-Based Garment Simulation: XPBD draping pipelines with robust collision resolution, constraint enforcement, and synthetic material parameterization (Korosteleva et al., 27 May 2024).
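
The triplet and center losses mentioned for retrieval networks can be written compactly. The sketch below uses the standard textbook forms on plain Python lists; the margin value, the squared-distance choice, and the function names are assumptions, and how the cited work weights or combines the two terms is not specified here.

```python
from typing import Sequence


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 0.2) -> float:
    """Hinge on (anchor-positive distance) minus (anchor-negative distance)."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def center_loss(embedding: Sequence[float], class_center: Sequence[float]) -> float:
    """Half the squared distance between an embedding and its class center."""
    return 0.5 * sq_dist(embedding, class_center)


# The loss is zero once the negative is farther than the positive by the margin.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
# Swapping positive and negative yields a large penalty.
print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```

The triplet term separates different garments in embedding space, while the center term pulls embeddings of the same garment class together, which is why the two are often used jointly for fine-grained matching.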

Benchmark metrics include precision@k, recall, and F1-score for classification and retrieval; FID, KID, SSIM, and CLIP-SIM for image generation quality; and Chamfer Distance and stretching energy for simulation and reconstruction (Ferreira et al., 2018, Li et al., 2020, Morelli et al., 2022, Wang et al., 29 Apr 2024, Korosteleva et al., 27 May 2024).
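
Two of these metrics are simple enough to sketch directly. The code below is a brute-force illustration of precision@k and of a symmetric Chamfer distance; Chamfer definitions vary across papers (squared or unsquared distances, sums or means), so the mean-of-unsquared-distances variant used here is an assumption rather than the exact formula of any cited benchmark.

```python
import math
from typing import Hashable, Sequence, Set, Tuple

Point = Tuple[float, float, float]


def precision_at_k(retrieved: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    top_k = retrieved[:k]
    return sum(1 for item in top_k if item in relevant) / k


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance: mean nearest-neighbor distance in both directions."""
    def mean_nearest(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(a, b) + mean_nearest(b, a)


print(precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=2))
print(chamfer_distance([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)]))
```

The nested loop in `chamfer_distance` is quadratic in the number of points, so evaluations on dense garment meshes would normally use a spatial index or a GPU implementation instead.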

Dataset	Modality	Annotation Levels
Dress Code (Morelli et al., 2022)	Paired images (HD)	Category, segmentation
4D-DRESS (Wang et al., 29 Apr 2024)	4D textured scans	Garment mesh, semantic
GarmentCodeData (Korosteleva et al., 27 May 2024)	3D mesh + patterns	Sewing pattern, material
CAPE (Ma et al., 2019)	3D scan frames	Mesh, offset deformation
FLORA (Deshmukh et al., 21 Nov 2024)	Image-text pairs	Textual style features

4. Applications and Industrial Impact

Dress code datasets have enabled advancements across several domains:

Automated Fashion Categorization: Joint learning of category, sub-category, and attribute relations enhances tag consistency and supports large-scale e-commerce indexing and recommendation (Ferreira et al., 2018).
Virtual Try-On Systems: Data diversity and resolution are instrumental in improving image realism and garment placement when transferring apparel to target models (Morelli et al., 2022).
Synthetic Fashion Generation: 3D scan and garment pattern datasets drive generative modeling for pose-dependent clothing detail and new design exploration (Ma et al., 2019, Korosteleva et al., 27 May 2024).
Garment Matching and Retrieval: End-to-end systems utilize generated garment images from street photos to retrieve and recommend products with high precision (Khaund et al., 2019).
Style Compatibility Engines: Outfit-based datasets underpin algorithms recommending complementary items to complete an ensemble, as deployed at Pinterest (Complete The Look) (Li et al., 2020).
Design Transfer and Personalization: Fitting sewing patterns to custom body shapes allows virtual tailoring and personalized design retargeting (Korosteleva et al., 27 May 2024).

5. Limitations, Challenges, and Future Directions

Notable limitations exist:

Domain Gap: Synthetic datasets rarely fully replicate real-world clothing dynamics and material complexity, necessitating real-world capture and annotation (Wang et al., 29 Apr 2024).
Garment Topology Constraints: Mesh-based generative models struggle to represent garments with non-body-conforming topology (e.g., skirts, open jackets) (Ma et al., 2019).
Data Imbalance and Scalability: Large-scale image datasets exhibit category imbalance and require continual data refreshing and cleaning pipelines (Ferreira et al., 2018, Li et al., 2020).
Annotation Cost: Manual correction is still required for edge cases in multi-view semantic parsing, but semi-automatic approaches mitigate most of the workload (Wang et al., 29 Apr 2024).
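
For the category imbalance noted above, one common and generic remedy is to weight each class inversely to its frequency in the training loss. The sketch below is an illustration of that idea only; it is not a technique attributed to the cited papers, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def inverse_frequency_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# A toy label list in which dresses outnumber shirts three to one.
weights = inverse_frequency_weights(["dress", "dress", "dress", "shirt"])
print(weights)
```

With this scheme a perfectly balanced dataset gives every class a weight of 1, and rarer classes receive proportionally larger weights.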

Emerging research trends include:

Pixel-level Semantic Discriminators: Fine-grained adversarial supervision proves effective for preserving garment and boundary detail (Morelli et al., 2022).
KAN Adapters and Text-to-Fashion Datasets: Architectures leveraging Kolmogorov-Arnold Networks for adaptation, together with datasets pairing fashion sketches and high-fidelity textual descriptions (FLORA), facilitate nuanced generative fashion design (Deshmukh et al., 21 Nov 2024).

6. Access, Open Source, and Community Resources

Several dress code datasets, models, and annotation pipelines are released openly, including Dress Code (Morelli et al., 2022), 4D-DRESS (Wang et al., 29 Apr 2024), GarmentCodeData (Korosteleva et al., 27 May 2024), FLORA (Deshmukh et al., 21 Nov 2024), and the Pinterest Complete The Look test set (Li et al., 2020). Project pages offer downloads, codebases, and supporting documentation for further research and industry application. This sharing ethos fosters reproducibility, benchmarking, and collaborative progress at the intersection of fashion, graphics, and machine learning.