
Dress Code: High-Resolution Multi-Category Virtual Try-On (2204.08532v2)

Published 18 Apr 2022 in cs.CV, cs.AI, cs.GR, and cs.MM

Abstract: Image-based virtual try-on strives to transfer the appearance of a clothing item onto the image of a target person. Prior work focuses mainly on upper-body clothes (e.g. t-shirts, shirts, and tops) and neglects full-body or lower-body items. This shortcoming arises from a main factor: current publicly available datasets for image-based virtual try-on do not account for this variety, thus limiting progress in the field. To address this deficiency, we introduce Dress Code, which contains images of multi-category clothes. Dress Code is more than 3x larger than publicly available datasets for image-based virtual try-on and features high-resolution paired images (1024x768) with front-view, full-body reference models. To generate HD try-on images with high visual quality and rich in details, we propose to learn fine-grained discriminating features. Specifically, we leverage a semantic-aware discriminator that makes predictions at pixel-level instead of image- or patch-level. Extensive experimental evaluation demonstrates that the proposed approach surpasses the baselines and state-of-the-art competitors in terms of visual quality and quantitative results. The Dress Code dataset is publicly available at https://github.com/aimagelab/dress-code.

Citations (93)

Summary

  • The paper introduces Dress Code, a large, diverse high-resolution dataset that includes various garment types to enhance virtual try-on research.
  • The paper presents a novel Pixel-Level Semantic-Aware Discriminator (PSAD) that significantly improves image realism during adversarial training.
  • The paper demonstrates through extensive benchmarking that the proposed architecture outperforms state-of-the-art methods in key metrics like SSIM, FID, and KID.

Analysis of "Dress Code: High-Resolution Multi-Category Virtual Try-On"

The paper "Dress Code: High-Resolution Multi-Category Virtual Try-On" presents an advancement in the field of image-based virtual try-on systems, introducing both a novel dataset and an improved architecture for generating high-quality, realistic try-on images. This research addresses key limitations in existing datasets and algorithms, ultimately contributing to the development and potential deployment of more robust virtual try-on systems in e-commerce.

Key Contributions

  1. Dress Code Dataset: The paper introduces Dress Code, a public dataset that is significantly larger than existing datasets, with over three times more images than VITON. It features high-resolution paired images and includes a diverse range of clothing categories, including upper-body, lower-body, and full-body garments. The dataset's availability is expected to enable further research and development in this area.
  2. Multi-Garment Support: Unlike many existing datasets that focus on upper-body items, Dress Code supports a broader range of garments, thus reflecting more realistic shopping scenarios and facilitating comprehensive evaluation of virtual try-on solutions.
  3. Pixel-Level Semantic-Aware Discriminator (PSAD): A key technical contribution is the introduction of PSAD. This component enhances image quality by integrating semantic understanding at the pixel level during adversarial training, yielding more detailed and realistic imagery (see the sketch after this list).
  4. Comprehensive Benchmarking: The research provides an extensive evaluation across various virtual try-on architectures and baselines, showcasing the superiority of the proposed discriminator in generating realistic results. The experimental validation includes comparisons of baselines and state-of-the-art competitors on the new Dress Code dataset.

Experimental Results

The paper shows that the proposed architecture with PSAD outperforms baseline methods such as CP-VTON and CIT, as well as other state-of-the-art approaches, on quantitative measures including SSIM, FID, KID, and IS. Notably, PSAD yields substantial improvements in realism and visual quality.
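
For readers who want to run this kind of evaluation themselves, the sketch below computes SSIM, FID, and KID with torchmetrics. It is a generic illustration, not the authors' evaluation code; tensor shapes and value ranges are assumptions.

```python
# Generic metric computation with torchmetrics; not the paper's evaluation script.
import torch
from torchmetrics.image import StructuralSimilarityIndexMeasure
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image.kid import KernelInceptionDistance

ssim = StructuralSimilarityIndexMeasure(data_range=1.0)         # paired, per-image metric
fid = FrechetInceptionDistance(feature=2048, normalize=True)    # distribution-level metric
kid = KernelInceptionDistance(subset_size=50, normalize=True)   # subset_size <= number of samples

@torch.no_grad()
def evaluate(generated: torch.Tensor, reference: torch.Tensor) -> dict:
    """generated/reference: float tensors in [0, 1], shape (N, 3, 1024, 768)."""
    fid.update(reference, real=True)
    fid.update(generated, real=False)
    kid.update(reference, real=True)
    kid.update(generated, real=False)
    kid_mean, _ = kid.compute()
    return {
        "SSIM": ssim(generated, reference).item(),
        "FID": fid.compute().item(),
        "KID": kid_mean.item(),
    }
```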

The paper further tests performance with higher-resolution inputs, showing that the approach scales while maintaining competitive quality metrics. This scalability matters for real-world applications, where higher-resolution outputs are often required.

Implications and Future Work

The implications are notable for both academia and industry, particularly in the fashion e-commerce space, where realistic virtual try-on capabilities can greatly enhance user experience and sales conversion rates. The ability to accurately depict a wide range of garments on diverse body types at high resolution addresses significant barriers to adoption.

Future work may explore further optimization of PSAD for other generative tasks, refinement of techniques for different garment types, or extension of the methodology to even higher resolutions. Integrating additional modalities, such as 3D data, could further enhance realism.

Conclusion

The paper makes a substantial contribution to the virtual try-on field by presenting Dress Code, a diverse and high-resolution dataset, alongside a technically sound architecture in PSAD. These elements combine to deliver state-of-the-art performance, advancing research capabilities and paving the way for compelling applications in the digital commerce space. The public availability of Dress Code is expected to catalyze future developments in virtual try-on technologies, fostering innovation and application readiness.
