
GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning (2303.13756v1)

Published 24 Mar 2023 in cs.CV

Abstract: Image-based Virtual Try-ON aims to transfer an in-shop garment onto a specific person. Existing methods employ a global warping module to model the anisotropic deformation for different garment parts, which fails to preserve the semantic information of different parts when receiving challenging inputs (e.g., intricate human poses, difficult garments). Moreover, most of them directly warp the input garment to align with the boundary of the preserved region, which usually requires texture squeezing to meet the boundary shape constraint and thus leads to texture distortion. The above inferior performance hinders existing methods from real-world applications. To address these problems and take a step towards real-world virtual try-on, we propose a General-Purpose Virtual Try-ON framework, named GP-VTON, by developing an innovative Local-Flow Global-Parsing (LFGP) warping module and a Dynamic Gradient Truncation (DGT) training strategy. Specifically, compared with the previous global warping mechanism, LFGP employs local flows to warp garment parts individually, and assembles the local warped results via the global garment parsing, resulting in reasonable warped parts and a semantically correct intact garment even with challenging inputs. On the other hand, our DGT training strategy dynamically truncates the gradient in the overlap area and the warped garment is no longer required to meet the boundary constraint, which effectively avoids the texture squeezing problem. Furthermore, our GP-VTON can be easily extended to multi-category scenarios and jointly trained by using data from different garment categories. Extensive experiments on two high-resolution benchmarks demonstrate our superiority over the existing state-of-the-art methods.

Citations (75)

Summary

  • The paper presents the GP-VTON framework, which enhances virtual try-on by combining local garment partitioning with global parsing for improved warping accuracy.
  • It employs a Dynamic Gradient Truncation training strategy to effectively prevent texture distortion and maintain garment integrity during transformations.
  • Experimental evaluations demonstrate GP-VTON’s superiority with higher SSIM scores and lower LPIPS values compared to existing state-of-the-art methods.

Overview of GP-VTON: Advancements in Virtual Try-on Systems

The paper presents GP-VTON, a framework for improving image-based virtual try-on (VTON). Existing VTON methods rely on a global warping module and struggle with complex garment parts and intricate human poses. The resulting texture distortion and semantic errors hinder practical use. GP-VTON addresses these problems with a Local-Flow Global-Parsing (LFGP) warping module and a Dynamic Gradient Truncation (DGT) training strategy.

Key Contributions

GP-VTON is designed to transfer an in-shop garment onto a target person image while maintaining semantic integrity and preserving texture details. The framework improves upon previous methods by introducing:

  1. Local-Flow Global-Parsing (LFGP) Module: This component divides the garment into local parts, warps each part individually with its own local flow, and assembles the warped parts using a global garment parsing. Warping parts separately handles the anisotropic deformation that a single global warp models poorly, and the result stays semantically correct even with challenging inputs.
  2. Dynamic Gradient Truncation (DGT) Training Strategy: During training, gradients in the area where the warped garment overlaps the preserved region are dynamically truncated, so the warped garment is no longer forced to match that region's boundary. This avoids texture squeezing and distortion. Whether truncation is applied depends on the disparity between garment dimensions.
  3. Multi-category Adaptability: GP-VTON extends beyond single-category try-on. A single framework can be trained jointly on data from different garment categories, covering upper-body garments, lower-body garments, and dresses.
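
The gradient truncation in item 2 can be illustrated with a minimal plain-Python sketch. It assumes an L1 loss between the warped garment and its target and a binary overlap mask. The function name `l1_loss_and_grad`, the choice of loss, and the `truncate` flag are illustrative assumptions, not the paper's implementation. The rule for deciding when to truncate is left to the caller, since the text above only says it depends on the disparity between garment dimensions.

```python
def l1_loss_and_grad(warped, target, overlap, truncate):
    """Mean L1 loss between two images, plus its gradient w.r.t. `warped`.

    warped, target: 2D lists of floats with the same shape.
    overlap: 2D list of 0/1 flags; 1 marks pixels in the overlap area.
    truncate: if True, overlap pixels contribute neither loss nor gradient
              (assumed behaviour, standing in for gradient truncation).
    """
    h, w = len(warped), len(warped[0])
    grad = [[0.0] * w for _ in range(h)]
    loss, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if truncate and overlap[y][x]:
                continue  # gradient is left at zero for this pixel
            diff = warped[y][x] - target[y][x]
            loss += abs(diff)
            grad[y][x] = float((diff > 0) - (diff < 0))  # sign of the error
            count += 1
    n = max(count, 1)  # avoid division by zero if every pixel is masked
    return loss / n, [[g / n for g in row] for row in grad]


# Toy 2x2 example: the bottom-right pixel lies in the overlap area.
warped = [[1.0, 0.5], [0.0, 1.0]]
target = [[1.0, 0.0], [0.0, 0.0]]
overlap = [[0, 0], [0, 1]]

loss_full, grad_full = l1_loss_and_grad(warped, target, overlap, truncate=False)
loss_trunc, grad_trunc = l1_loss_and_grad(warped, target, overlap, truncate=True)
```

With truncation on, the overlap pixel receives a zero gradient, so nothing pushes the warped garment to shrink toward the boundary at that location. In an autograd framework the same effect would typically be obtained by masking or detaching the loss in that region.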

Experimental Evaluation

GP-VTON was evaluated against state-of-the-art methods on two high-resolution benchmarks, VITON-HD and DressCode. Performance was quantified with the Structural Similarity Index (SSIM), Fréchet Inception Distance (FID), and Learned Perceptual Image Patch Similarity (LPIPS); higher is better for SSIM, and lower is better for FID and LPIPS.
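
To make the SSIM metric concrete, here is a simplified plain-Python sketch that applies the standard SSIM formula once over the whole image. This is an assumption made for brevity: typical SSIM implementations average the same formula over local windows, so the numbers from this sketch would not match published scores. The function name `global_ssim` is hypothetical.

```python
def global_ssim(img_a, img_b, data_range=1.0):
    """Single-window SSIM over two grayscale images given as 2D lists."""
    xs = [v for row in img_a for v in row]
    ys = [v for row in img_b for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n
    var_y = sum((v - mu_y) ** 2 for v in ys) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    # Stabilising constants from the usual SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [[0.0, 0.5], [1.0, 0.5]]
flat = [[0.5, 0.5], [0.5, 0.5]]  # same mean, but all structure removed

score_same = global_ssim(reference, reference)
score_flat = global_ssim(reference, flat)
```

An image compared with itself scores 1, while the structureless image scores well below 1 even though its mean intensity matches, which is why SSIM is read as a structural fidelity measure.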

  • Performance Gains: GP-VTON consistently achieved higher SSIM scores, indicating better structural fidelity, and lower LPIPS values, indicating closer perceptual similarity to the ground truth. It also achieved substantial improvements in mean Intersection over Union (mIoU), which reflects the semantic precision of the warping results.
  • Robustness and Realism: The framework successfully addressed the challenges posed by complex human poses and intricate garment inputs. The integration of local flows and parsing mechanisms within LFGP ensured the generation of realistic and semantically coherent try-on results.
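
The combination of local flows and global parsing mentioned above can be sketched on a toy example. The sketch makes strong simplifying assumptions: images are small 2D lists, each flow is a per-pixel integer offset sampled with nearest-neighbour lookup, and the parsing map is supplied by hand, whereas the paper's module estimates the flows and the parsing itself. The names `warp` and `assemble` and the part labels are hypothetical.

```python
def warp(part, flow):
    """Backward warp: output pixel (y, x) samples `part` at (y + dy, x + dx)."""
    h, w = len(part), len(part[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:  # out-of-range samples stay empty
                out[y][x] = part[sy][sx]
    return out


def assemble(warped_parts, parsing):
    """Per pixel, keep the warped part that the global parsing assigns there."""
    h, w = len(parsing), len(parsing[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            label = parsing[y][x]
            if label in warped_parts:  # unknown labels (background) stay 0
                out[y][x] = warped_parts[label][y][x]
    return out


# One-row toy garment: a sleeve pixel (value 5) and two torso pixels (value 9).
parts = {"sleeve": [[5, 0, 0, 0]], "torso": [[0, 9, 9, 0]]}
flows = {
    "sleeve": [[(0, -1)] * 4],  # shifts the sleeve one pixel to the right
    "torso": [[(0, 0)] * 4],    # leaves the torso in place
}
warped_parts = {name: warp(parts[name], flows[name]) for name in parts}

# After warping, sleeve and torso both claim column 1; the parsing resolves it.
parsing = [["bg", "sleeve", "torso", "bg"]]
result = assemble(warped_parts, parsing)
print(result)  # [[0, 5, 9, 0]]
```

Because every part is warped with its own flow, the parts can overlap after warping. The parsing map decides which part owns each pixel, which is how the assembled garment stays semantically consistent.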

Implications and Future Directions

From a practical standpoint, GP-VTON enhances the applicability of virtual try-ons in real-world scenarios, offering potential to transform e-commerce platforms by providing higher accuracy in virtual fittings. Theoretical advancements include the framework's novel approach to garment partitioning and warping, which may inspire future adaptations and innovations in image synthesis.

Looking forward, the paper lays a foundation for expanding AI capabilities in virtual fashion, where dynamic and adaptive systems become integral to providing seamless and individualized experiences. Further research could explore the integration of additional garment categories and the application of reinforcement learning to optimize the warping process in real time.

In summary, GP-VTON represents a significant stride in virtual try-on technology, providing a robust framework that balances practicality and performance while navigating the complexities inherent in garment and pose diversity.